(54) Title: COMPOUND SCREENS RELATING TO INSULIN DEFICIENCY OR INSULIN RESISTANCE

(57) Abstract: The invention is concerned with use of the model organism C. elegans as a research tool to screen for compounds
active in insulin signalling. In particular, the invention relates to improved screening methods based on release of C. elegans from
the dauer larval state.

COMPOUND SCREENS RELATING TO INSULIN DEFICIENCY OR
INSULIN RESISTANCE

The present invention is concerned with using the 
model organism C. elegans as a research tool to 
effectively screen compound libraries for compounds 
active in insulin signalling, in particular compounds 
which act downstream of the insulin receptor. 
Specifically the invention relates to improved 
screening methods based on release of C. elegans from 
the dauer larval state. 

In a particular embodiment, the invention
provides improved screening methods using C. elegans
carrying mutations in one or more gene(s) involved in
the insulin signalling pathway, such as the daf genes.
In one particular embodiment, (at least one of) said
mutation(s) is in the daf-2 gene, which is homologous
to the insulin receptor subfamily of receptor tyrosine
kinases. On the basis of the homology between daf-2
and the insulin receptor subfamily it is proposed that
worms mutant in the daf-2 gene may serve as models for
insulin-related diseases and disease risks, such as
diabetes mellitus, obesity, insulin resistance
and impaired glucose tolerance (Kimura et al. 1997,
Science 277, 942-946).

General techniques and methodology for performing 
in vivo assays using the nematode worm Caenorhabditis 
elegans (C. elegans) as a model organism have been 
described in the art, most notably in the following 
applications by applicant: PCT/EP99/09710 (published
on 15 June 2000 as WO 00/34438); PCT/EP99/04718
(published on January 15, 2000 as WO 00/01846);
PCT/IB00/00575 (published on October 26, 2000 as WO
00/63427); PCT/IB00/00557 (published on October 26,
2000 as WO 00/63425); PCT/IB00/00558 (published on
October 26, 2000 as WO 00/63426); as well as for
instance PCT/US98/10080 (published on 19-11-1998 as WO
98/51351), PCT/US99/13650, PCT/US99/01361 (published
on 29-07-1999 as WO 99/37770), and PCT/EP00/05102.

As described in these applications, one of the
main advantages of assays involving the use of C.
elegans is that such assays can be carried out in
multi-well plate format (with each well usually
containing a sample of between 2 and 100 worms) and -
also because of this - may be carried out in an
automated fashion, i.e. using suitable robotics (as
are described in the aforementioned applications
and/or as may be commercially available). This makes
assays involving the use of C. elegans ideally suited
for screening of libraries of chemical compounds, in
particular at medium to high throughput. Such
automated screens may for instance be used in the
discovery and/or development of new compounds (e.g.
small molecules) for pharmaceutical, veterinary or
agrochemical/pesticidal (e.g. insecticidal and/or
nematocidal) use.

Some other advantages associated with the use of
C. elegans as a model organism (e.g. in the assay
techniques referred to above) include, but are not
limited to:

- C. elegans has a short life-cycle of about 3 days.
This not only means that these nematodes (and suitable
mutants, transgenics and/or stable lines thereof) can
be cultivated/generated quickly and in high numbers,
but also allows assays using C. elegans to test, in a
relatively short period of time and at high
throughput, the nematode worms over one or more, and
up to all, stages of life/development, and even over
one or more generations. Also, because of this short
life span, in C. elegans-based assays, compounds may be
tested over one or more, and up to essentially all,
stages of development, without any problems associated
with compound stability and/or (bio)availability;

- C. elegans is transparent, allowing - with advantage -
for visual or non-visual inspection of internal organs
and internal processes, and also the use of markers
such as fluorescent reporter proteins, even while the
worms are still alive. Also, as further mentioned
below, such inspection may be carried out in automated
fashion using suitable equipment such as plate
readers;

- C. elegans is a well-established and well-characterized
model organism. For example, the genome
of C. elegans has been fully sequenced, and also the
complete lineage and cell interactions (for example of
synapses) are known. In addition, C. elegans has full
diploid genetics, and is capable of both sexual
reproduction (e.g. for crossing) as well as
reproduction as a self-fertilizing hermaphrodite. All
this may provide many advantages, not only for the use
of C. elegans in genetic and/or biological studies, but
also for the use of C. elegans in the discovery,
development and/or pharmacology of (candidate) drugs
for human or animal use.

- Techniques for transforming, handling, cultivating,
maintaining and storing (e.g. as frozen samples, which
offers great practical advantages) C. elegans are well
established in the art, for instance from the
handbooks referred to below. For example, C. elegans
may be used as one or more samples with essentially
fully isogenic genotype(s).

Generally, in the assays described above, the
nematodes are incubated in a suitable vessel or
container - such as a compartment or well of a
multi-well plate - on a suitable medium (which may be a
solid, semi-solid, viscous or liquid medium, with
liquid and viscous media usually being preferred for
assays in multi-well plate format). The nematodes are
then contacted with the compound(s) to be tested, e.g.
by adding the compound to the medium containing the
worms. After a suitable incubation time (i.e.
sufficient for the compound to have its effect - if
any - on the nematodes), the worms are then subjected
to a suitable detection technique, i.e. to
measure/determine a signal that is representative of
the influence of the compound(s) to be tested on the
nematode worms, which may then be used as a measure
of the activity of the compound(s) in the in vivo
assay.

Often, in particular for automated assays, such a
detection technique involves a non-visual detection
method (as further described in the applications
mentioned above), such as measurement of fluorescence
or another optical method, measurement of a particular
marker (either associated with worms or associated
with the medium) such as autonomous fluorescent
proteins (AFPs) such as green fluorescent proteins
(GFPs), aequorin, alkaline phosphatase, luciferase,
beta-glucuronidase, beta-lactamase,
beta-galactosidase, acetohydroxyacid, chloramphenicol
acetyl transferase, horseradish peroxidase, nopaline
synthase, or octopine synthase. For example, for
automated assays carried out in multi-well plates,
so-called (multi-well) "plate readers" may be used for
detecting/measuring said signal.

For a further description of the above and other
assay techniques involving the use of nematodes as a
model organism, reference is made to the prior art,
such as the applications by applicant referred to
above.

For general information on C. elegans and
techniques for handling this nematode worm, reference
is made to the standard handbooks, such as W.B. Wood
et al., "The nematode Caenorhabditis elegans", Cold
Spring Harbor Laboratory Press (1988) and D.L. Riddle
et al., "C. elegans II", Cold Spring Harbor Laboratory
Press (1997).

The use of C. elegans based assays in the field of
metabolic diseases - such as obesity and diabetes -
has been described in a number of applications, most
notably in PCT US 98/10800 and US-A-6,225,120, which
relate to the use of daf-2 mutant C. elegans nematodes
for selecting compounds active in impaired glucose
tolerance and diabetes, as a model for insulin
resistance.

One of the main objects of the present invention
is to provide improved methods for the selection of
compounds for the field of metabolic diseases -
including but not limited to obesity, impaired glucose
tolerance and type-II diabetes - which methods may be
used for drug discovery, development, pharmacology and
testing. In particular, it is an object of the
invention to provide such improved assays as compared
to the assay techniques described in PCT US 98/10800
and US-A-6,225,120.

Generally, the invention solves this problem by
the use, in such assays, of nematode strains (such as
m41) which have increased sensitivity of the insulin
signalling pathway compared to the strains used in PCT
US 98/10800 and US-A-6,225,120.

Diabetes mellitus is a major growing public
health problem in both developed and developing
countries. Including clinical complications it
accounts for 5% of the total healthcare expenditure in
Europe. Depending on the type of diabetes, current
drug-therapy strategies for diabetes consist of a diet
supported by either application of exogenous insulin
of different origin, or application of drugs that
increase production and/or release of endogenous
insulin, enhance sensitivity of peripheral organs to
insulin or mimic insulin effects. Drugs acting
directly in the insulin pathway downstream of the
receptor are potentially beneficial in both major
types of diabetes, but no such drugs exist today.
The major drawback of currently available drugs is the
body weight gain that comes on top of an existing
obesity in the vast majority (80%) of patients. This
side effect is also the main reason why
pharmacological intervention in the middle range of
disease development is not as intense and aggressive
as it should be to achieve optimal efficacy. New
drugs that are devoid of this side effect would
already reduce risk of complications by 12 to 30%
(United Kingdom Prospective Diabetes Study. Turner et
al. 1998, BMJ 316: 823-828; Turner et al. 1999, JAMA
281: 2005-2012).

Novel glitazones, such as troglitazone, which act
on nuclear receptors that regulate carbohydrate
metabolism and which were launched in Japan and the US,
were withdrawn due to an elevated risk of liver
toxicity. Hence the medical need for well tolerated,
orally active anti-diabetics with mild, benign
side-effects remains high. A compound that directly
interacts downstream in the insulin receptor pathway
could establish a breakthrough, especially since it
could be a drug that acts both in Type I and Type II
diabetes. A compound that has as a clinical result an
insulin sparing effect could also be of extremely high
therapeutic value.

From animal studies inorganic vanadates are known
to favourably combine an increase in insulin sensitivity
and a reduction of hyperlipidemia with body
weight stability or loss, and are devoid of body
weight gain (Brichard and Henquin 1995, TIPS 16:
265-270). Due to unresolved toxicity issues, however,
they are not available in drug formulas. Although
inorganic vanadium compounds are currently in clinical
trial, the issue of side effects still raises doubts
as to whether this class of compounds can meet the
specification of a drug, which has to be well tolerated
in multiple doses per day for decades.

Nevertheless, the recognition of protein tyrosine
phosphatase 1B as the major target of vanadates and
the validation of this target as strongly increasing
insulin sensitivity when inactivated in mice point
towards the insulin receptor pathway as valuable for
finding active compounds to ameliorate insulin
resistance (Elchebly et al. 1999, Science 283:
1544-1548). PTP-1B is a negative regulator of insulin
receptor tyrosine phosphorylation and kinase activity;
its inactivation raises insulin signalling at a
given constant insulin level (Figure 1). The present
inventors have shown that vanadates can rescue the
genetic insulin resistance caused by daf-2 mutations
in Caenorhabditis elegans, thereby validating the
genetic model for insulin-deficient and
insulin-resistant related disease by pharmacological
means (Figure 3). Wortmannin, an inhibitor of the
downstream effector phosphatidyl-inositol-3-phosphate
kinase (Figure 1), further increases insulin
resistance, confirming the sensitivity of the invented
assay for the pathway (Figure 4). The possible known
targets in the insulin-receptor pathway shown in
Figure 1 are listed in Table 1.

The inventors have made two key adaptations which
enable them to use C. elegans mutant strains to
effectively screen large compound libraries for
activities mimicking vanadates using screens based on
rescue of the dauer formation phenotype and other
phenotypic traits which are caused by interventions in
the insulin signalling pathway, such as, for example,
mutations in the insulin receptor gene homologue daf-2.
The first adaptation is the use of C. elegans with
a sensitized genetic background; the second adaptation
is manipulation of the assay conditions such that a
basal level of release from the dauer larval state is
present even in the absence of test compounds. The
daf-2 gene had previously been disregarded as a useful
target for compound screens due to a failure to
obtain active compounds from large compound
libraries (Carl Johnson, Axys Pharmaceuticals,
Nemapharm division, disclosed at the Cold Spring
Harbor worm course). The new developments described
herein overcome sensitivity problems previously
encountered with screens based on daf-2.

In the invention, generally nematode strains are
used that show sensitivity of the insulin signalling
pathway.

In particular, these strains are used in assays
involving the use of a dauer stage and/or dauer
phenotype as a read out. These may for instance be
assays based on "dauer rescue" and/or on "dauer
formation/bypass" (of which dauer bypass is usually
preferred, as it may avoid the problems associated
with the limited uptake of the compound(s) to be
tested by worms in the dauer state).

In the former type of assay, a sample of worms in
the dauer state is provided, and the efficacy of the
compound(s) to be tested in bringing the worms of said
sample out of the dauer state is determined.
Generally, compounds with the desired activity will
bring the worms out of the dauer state (i.e. to a
greater degree than a reference without compound, and
preferably in a dose/concentration-dependent manner)
and thus provide adults (i.e. more adults than without
the presence of the compound(s) to be tested).

In the latter type of assay, a sample of worms
(in particular eggs, L1 or L2 worms, and preferably L1
worms) is kept under conditions which, without the
presence of any compound(s) to be tested, would cause
most (and preferably essentially all) of the worms in
the sample to enter the dauer state, and the efficacy
of the compound(s) to be tested in preventing the
worms, under these conditions, from entering the dauer
state (i.e. to bypass the dauer state) is determined.
Generally, compounds with the desired activity will
prevent the worms from entering the dauer state (i.e.
to a greater degree than a reference without compound,
and preferably in a dose/concentration-dependent
manner) and thus provide adults (i.e. more adults than
without the presence of the compound(s) to be tested,
and preferably in a dose-dependent manner). Conditions
such that the worm strain(s) used will enter the dauer
state without the presence of the compound(s) to be
tested will depend on the specific worm strain used
and will be clear to the skilled person, also in view
of the preferred conditions described hereinbelow.
Also, these conditions are preferably such that, under
the conditions of the assay, a reference compound with
the desired activity (such as vanadate at a
concentration of between 0.5 and 2 millimolar) will
allow a measurable amount of worms to bypass the dauer
state (e.g. between 40 and 70%, or even more). If
necessary, the results obtained with such a reference
compound may also serve as a positive control or
comparative reference for the compound(s) to be
tested.

As will be clear to the skilled person, for both
the dauer rescue and the dauer bypass assays described
above, and during or at the end of the assay, either
the number of dauer larvae in the sample and/or the
number of adults may be determined (with the sum of
the number of dauer larvae and the number of adults
being essentially equal to the number of worms present
in the original sample). Techniques for determining
the number of adults and/or dauer larvae in a sample
will be clear to the skilled person and may include
visual inspection of the sample (e.g. counting) as
well as the automated non-visual detection techniques
referred to above.

In the context of the present invention, the
insulin signalling pathway may generally be described
as all enzymatic conversions and other signal
transduction events that are involved in
(transmembrane) receptor-mediated (cellular) signal
transduction in response to the (extracellular)
presence of insulin signals (e.g. the extracellular
presence of insulin or insulin-like compounds). Some
of the most important (but non-limiting) examples of
the different enzymatic conversions involved in said
signalling have already been mentioned hereinabove.

By "sensitivity of the insulin signalling
pathway" is generally meant that
1) the nematode shows one or more biological
response(s) to the presence of an insulin, to the
presence of an insulin-like compound, and/or to the
presence of a compound that can provide and/or
mimic a biological response similar to the
biological response(s) provided by insulin or the
insulin-like molecules (which three categories are
also collectively referred to herein as "insulin-like
signals"); and that
2) said one or more biological responses change when
(the amount of) the compound(s) to which the
nematode is exposed (and/or with which said
nematode comes into contact) changes or is altered
(for instance, due to a change in the concentration
of said insulin-like signal in the medium).

The biological response may be any response or
combination of responses, such as one or more changes
in physiology, biochemistry, development, behaviour,
excitation, or other phenotypical properties.

In one particularly preferred embodiment, these
may essentially be one or more of the biological
responses that are (also) obtained upon
(over)expression of insulin in the nematode.

One particularly suited biological response may
be the dauer behaviour, e.g. the entry, exit, rescue
or bypass of the dauer state, and/or other
phenotypical properties that result from and/or are
associated with the so-called dauer decision.
In the invention, (one or more strains of)
nematodes are used that show increased sensitivity of
the insulin pathway, compared to at least the
wildtype, and preferably also compared to the
reference strain CB1370 (containing the daf-2
reference mutation e1370). This strain is publicly
available, for example from the Caenorhabditis
Genetics Center (CGC), Minnesota, USA.

By "increased sensitivity of the insulin
signalling pathway" is generally meant that the change
in the biological response of the nematode (as
described above) to a change in (the concentration of)
the insulin-type signal is greater than the change
that is obtained with the wildtype and/or CB1370 (i.e.
for the same change in (the concentration of) the
insulin-type signal).

For example, when a change in (e.g. an increase
or reduction of) the concentration of an insulin-type
signal gives, for the wildtype and/or CB1370, a change
in (e.g. an increase or reduction of) the biological
response by a factor of x, then the same change
will give, for a strain suitable for use in the
invention, a change in the same biological response of
more than x (e.g. 1.05 times x, preferably 1.1 times
x, more preferably 1.5 times x or even 2 times x or 10
times x, depending on the biological response, the
insulin-type signal, the change in concentration, and
the specific strain(s) used). In case there is no
change observed in wildtype and/or the reference
strain CB1370, any change observed determines a strain
to be of "increased sensitivity to an insulin-type
signal".
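
By way of non-limiting illustration only, the comparison described above may be sketched in Python as follows. The function name, its parameters and the representation of each response change as a single number (e.g. a change in the percentage of dauer larvae) are assumptions made for this sketch and do not form part of the disclosure.

```python
def has_increased_sensitivity(reference_change: float,
                              strain_change: float,
                              min_factor: float = 1.0) -> bool:
    """Compare response changes obtained for the SAME change in signal.

    reference_change: change in the biological response for wildtype
        and/or CB1370 (the "x" of the text).
    strain_change: change in the same response for the candidate strain.
    min_factor: hypothetical multiplier; the text mentions e.g. 1.05,
        1.1, 1.5, 2 or 10 times x.
    """
    ref = abs(reference_change)
    strain = abs(strain_change)
    if ref == 0.0:
        # No change in the reference: any observed change qualifies.
        return strain > 0.0
    return strain > min_factor * ref
```

For instance, with a reference change of 10 and a strain change of 15, the criterion is met for a factor of 1.1 but not for a factor of 2.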

For example, an "insulin-type signal" as used
herein may be:

- an insulin or insulin-like molecule (e.g. from any
suitable source, including but not limited to
nematodes, humans or other animals), for which
reference is made to PCT/US99/08522, published as
WO 99/54436 on 28.10.99; Genes & Development 15:
672-686, 2001;

- a vanadate or a vanadate-type compound, such as
sodium orthovanadate;

- a PTP-1B inhibitor such as described in Journal of
Medicinal Chemistry 43: 1293-1310, 25.02.2000, for
example compound 66;

- wortmannin or a wortmannin-type compound, such as
LY 294002 or other PI3-kinase inhibitors.

In this respect, it should be noted that an
increase in the concentration of an insulin-type
signal may provide an increase in the biological
response (in which said increase will be more
pronounced for the strain of the invention than for
the wildtype and/or for CB1370), or may provide a
decrease in the biological response (in which said
decrease will be more pronounced for the strain of the
invention than for the wildtype and/or for CB1370).
For example, an increase in the concentration of a
wortmannin will provide an increase in the biological
response (for example more dauer), which will be even
more pronounced for the strains of the invention (e.g.
even more dauer compared to wildtype/CB1370 per
increased concentration of wortmannin), whereas an
increase in the concentration of a vanadate will
provide a decrease in the biological response (for
example less dauer), which will be even more
pronounced for the strains of the invention (e.g. even
less dauer compared to wildtype/CB1370 per increased
concentration of vanadate). In case the number of
nematodes grown up, i.e. non-dauer, is counted, the
positive (i.e. increased) and negative (i.e.
decreased) biological responses are interchanged.
Both types of insulin-type signals may be used
to determine whether a specific nematode strain
has "increased sensitivity of the insulin signalling
pathway" compared to wildtype and/or CB1370, and thus
may be used within the scope of the present invention.
Preferably, the insulin-type signal that is used
to determine whether a specific nematode strain has
"increased sensitivity of the insulin signalling
pathway" is a vanadate-type compound. The vanadate may
be used as a free base or as a suitable water-soluble
salt, such as sodium orthovanadate. Preferably, the
vanadate is used in an amount of between 0.01 and 100
millimolar, more preferably between 0.1 and 10
millimolar, such as 0.5 millimolar or 2.0 millimolar.
Some specific conditions under which vanadates
may be used to determine whether a specific nematode
strain has "increased sensitivity of the insulin
signalling pathway" will be further described below.

Thus, as will be clear from the above, the
"insulin-type factor(s)" described above may be used
to determine whether a strain has increased
sensitivity of the insulin signalling pathway (i.e.
compared to the wildtype and/or CB1370) and thus may
be used within the scope of the invention.

Generally, such a nematode strain useful in the
invention will have "increased sensitivity of the
insulin signalling pathway" due to a mutation and/or
another genetically determined factor that provides
such increased sensitivity. Such strains will also be
referred to below as having a "sensitized genetic
background", and some preferred examples thereof, such
as DR1564 and CB1368, will be further described below.



WO 01/93669 PCT/IB01/01199 


However, it is also within the scope of the
invention to provide the strain(s) used with
"increased sensitivity of the insulin signalling
pathway" by other means, such as exposure to
pheromones which increase such sensitivity, by gene
suppression techniques such as RNAi, and/or by
growing/cultivating the nematodes in the presence of
an inducing or suppressing factor (such as population
density, food concentration and temperature).

In particular, the nematode strain used may be a
weak Daf mutant (i.e. a mutant abnormal in dauer
formation), in particular a Daf mutant that is weaker
than the reference strain CB1370. For instance, it may
be an age-2 mutant, or one of the other daf mutants
mentioned herein.

In particular, the nematode strain used may be a
weak daf-2 mutant, in particular a daf-2 mutant that
is weaker than the reference strain CB1370.

For instance, the strain used may
have a Class-I mutation (as mentioned in Gems et al.,
supra), a mutation which provides a phenotype similar
to - and preferably essentially the same as - a Class-I
mutation, and/or a(nother) mutation in the ligand
binding domain, such that the mutated receptor still
has an active kinase domain, but the sensitivity to
insulin-like signalling is impaired. However, in its
broadest scope, the invention is not limited thereto,
and other mutations may also be present, including
Class II mutations, as long as the strain having the
mutation still has increased sensitivity of the
insulin signalling pathway, compared to the wildtype
and/or the reference strain C. elegans CB1370.

It is also possible, in the assays of the
invention, to use two or more different strains, e.g.
one or more which have increased sensitivity of the
insulin signalling pathway, and/or one or more
references, e.g. wildtype or CB1370.

In one preferred, but non-limiting aspect of the
invention, the sensitivity of the insulin signalling
pathway of the nematode strain used may be expressed
in terms of the "Insulin Sensitivity Value" (ISV),
which may be determined in the following manner:

A sample of nematode worms (preferably in the L1
stage) is incubated for between 48 and 96 hours
(preferably about 72 hours) separately with and
without an insulin-type signal (preferably a
vanadate-type compound), at a temperature of between 20 and
25°C (such as 20, 21, 22, 23, 24 or 25°C), in the
presence of a suitable source of food (such as
bacteria, e.g. between 0.05 and 0.5 % w/v, preferably
about 0.125 % w/v), and using a suitable medium (such
as S-buffer, M9 or one of the media described in the
applications referred to above, and preferably
S-buffer).

After incubation, for both the sample with the
insulin-type signal and the sample without the
insulin-type signal, the number of worms in
the sample that enter into the dauer state is
determined, as a percentage of the number of worms in
the original sample, i.e. as follows:

1) for the sample without the insulin-type signal:

([the number of worms that enter the dauer state
without insulin-type signal] divided by [the
total number of L1 worms in the original sample])
times [100%].

This percentage is herein referred to as "Percentage
A".

2) for the sample with the insulin-type signal:

([the number of worms that enter the dauer state
with the insulin-type signal] divided by [the
total number of L1 worms in the original sample])
times [100%].

This percentage is herein referred to as "Percentage
B".

The Insulin Sensitivity Value may then be
expressed as the absolute difference between
"Percentage A" and "Percentage B" (i.e. as the absolute
value of ["Percentage A" minus "Percentage B"]).

As the ISV is calculated as a difference between
two percentages A and B, the ISV itself will be a
percentage (for instance, when Percentage A is 90%
and Percentage B is 10%, the ISV will be 90% - 10% =
80%), and never negative, as the absolute value is
calculated (for instance, when Percentage A is 10% and
Percentage B is 90%, the ISV will be |10% - 90%| =
|-80%| = 80%).
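
As a non-limiting illustration, the determination of the ISV from worm counts may be sketched in Python as follows. The function and parameter names are hypothetical and chosen for this sketch only; both percentages are taken on a 0-100% scale, in line with the worked example of 90% and 10% given above.

```python
def dauer_percentage(n_dauer: int, n_total: int) -> float:
    """Percentage of the original sample that entered the dauer state."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_dauer <= n_total:
        raise ValueError("n_dauer must lie between 0 and n_total")
    return 100.0 * n_dauer / n_total


def insulin_sensitivity_value(dauer_without: int, total_without: int,
                              dauer_with: int, total_with: int) -> float:
    """ISV = |Percentage A - Percentage B|, expressed in percent."""
    percentage_a = dauer_percentage(dauer_without, total_without)  # no signal
    percentage_b = dauer_percentage(dauer_with, total_with)        # with signal
    return abs(percentage_a - percentage_b)
```

With 90 of 100 worms entering the dauer state without the signal and 10 of 100 with the signal, this sketch returns an ISV of 80.0; swapping the two samples gives the same value.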

In the invention, the nematode strain used
preferably has an ISV that is greater than the ISV for
CB1370. In particular, the nematode strain used may be
such that its ISV is more than 1% greater, preferably
more than 5% greater, more preferably more than 10%
greater, even more preferably more than 20% greater
than the ISV for CB1370 (e.g. calculated as the
absolute difference between the ISV for the strain
used and the ISV for CB1370, e.g. [ISV strain used]
minus [ISV CB1370]).

For example, depending upon the specific
conditions of the test, CB1370 will usually have an
ISV of <20%, more usually <10%, and often <5% (in
essence, this means that under the conditions of the
test, for CB1370, there is little or no difference
between the presence and the absence of the
insulin-type signal). The ISV for wildtype will usually be
even lower than the ISV for CB1370.

For the strain used in the invention, under the
same conditions of the test, the ISV will usually be
>30%, and is preferably >40%, and is even more
preferably >50% (in essence, this means that under
the conditions of the test, for the strain used, the
difference between the presence and the absence of the
insulin-type signal is preferably (much) larger than
for CB1370).

Preferably, the ISV is determined using a
vanadate-type compound such as sodium orthovanadate,
although the invention in its broadest sense is not
limited thereto.

Thus, by determining the ISV in the manner
outlined above, it can be determined whether a strain
has increased sensitivity of the insulin signalling
pathway, compared to the wild-type and/or the
reference strain CB1370.
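
Purely as an illustrative sketch, the comparison of the ISV of a candidate strain with the ISV of CB1370 may be expressed in Python as follows. The function name and the return convention are assumptions of this sketch; the margins of 1, 5, 10 and 20 percentage points are those mentioned above, and both ISVs are assumed to have been determined under the same test conditions.

```python
from typing import Optional

# Margins (in percentage points) by which the ISV of the strain used
# may exceed the ISV of CB1370, from most to least preferred.
MARGINS = (20.0, 10.0, 5.0, 1.0)


def highest_margin_exceeded(isv_strain: float,
                            isv_cb1370: float) -> Optional[float]:
    """Return the largest listed margin that is exceeded, or None."""
    difference = isv_strain - isv_cb1370  # [ISV strain used] minus [ISV CB1370]
    for margin in MARGINS:
        if difference > margin:
            return margin
    return None
```

For example, a strain with an ISV of 55% compared against a CB1370 ISV of 4% exceeds the 20-point margin, whereas a strain with an ISV of 4.5% exceeds none of the listed margins.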

Generally, the invention is based on the insight
that such nematode strains having increased
sensitivity of the insulin signalling pathway can be
used with advantage to provide improved methods for
the selection of compounds for the field of metabolic
diseases, in particular compared to the assay
techniques described in PCT US 98/10800 and
US-A-6,225,120. As mentioned above, these methods may be
used for drug discovery, development and pharmacology,
for instance to discover and/or develop new small
molecules and/or small peptides suitable for use in
preventing or treating metabolic diseases in humans or
other vertebrates (such as mammals).

For the purposes of the present disclosure, a
"small molecule" generally means a molecular entity
with a molecular weight of less than 1500, preferably
less than 1000. This may for example be an organic,
inorganic or organometallic molecule, which may also
be in the form of a suitable salt, such as a
water-soluble salt.

The term "small molecule" also covers complexes,
chelates and similar molecular entities, as long as
their (total) molecular weight is in the range
indicated above.

In a preferred embodiment, such a "small
molecule" has been designed according to, and/or meets
the criteria of, at least one, preferably at least any
two, more preferably at least any three, and up to all
of the so-called Lipinski rules for drug likeness
prediction (vide Lipinski et al., Advanced Drug
Delivery Reviews 23 (1997), pages 3-25). As is known
in the art, small molecules which meet these criteria
are particularly suited (as starting points) for the
(design and/or) development of drugs (e.g. for human
use), e.g. for use in (the design and/or compiling of)
chemical libraries for (high throughput) screening,
(as starting points for) hits-to-leads chemistry,
and/or (as starting points for) lead development.

In a preferred embodiment, such a "small
molecule" has been designed according to, and/or meets
the criteria of, at least one, preferably at least any
two, more preferably at least any three, and up to all
of the so-called Lipinski rules for rational drug
design (vide Lipinski et al., Advanced Drug Delivery
Reviews 23 (1997), pages 3-25). As is known in the
art, small molecules which meet these criteria are
particularly suited (as starting points) for the
design and/or development of drugs (e.g. for human use).

Also, for these purposes, the design of such
small molecules (as well as the design of libraries
consisting of such small molecules) preferably also
takes into account the presence of pharmacophore
points, for example according to the methods described
by I. Muegge et al., J. Med. Chem. 44, 12 (2001),
pages 1-6 and the documents cited therein.

The term "small peptide" generally covers
(oligo)peptides that contain a total of between 2 and
35, such as for example between 3 and 25, amino acids
(e.g. in one or more connected chains, and preferably
a single chain). It will be clear that some of these
small peptides will also be included in the term small
molecule as used herein, depending on their molecular
weight.

Thus, the methods of the invention may in
particular be used to test and/or screen (libraries
of) such small molecules and/or peptides, in the
manner as further outlined herein.

Thus, in one aspect, the invention relates to the
use of at least one nematode worm which has an
increased sensitivity of the insulin signalling
pathway (compared to the wildtype and/or the reference
strain CB1370), in an assay for the identification of
a compound, such as a small molecule and/or a small
peptide, which is capable of modulating insulin
signalling pathways (for example in C. elegans and/or
vertebrates, such as humans and/or other mammals),
more generally of altering and/or affecting the
biological response to insulin signalling, and even
more generally for use in (the preparation of
compositions for) the prevention and/or treatment of
metabolic diseases or disorders (as mentioned above),
in vertebrates such as humans or other mammals.

In addition to the identification of small
molecules and/or small peptides, according to the
invention, the nematode worms with an increased
sensitivity of the insulin signalling pathway may also
be used for determining the influence or effect of
gene suppression (e.g. by RNAi techniques), and of
specific or non-specific mutations (e.g. due to
non-specific or (site-)specific mutagenesis).

Preferably, the nematode worm with increased
sensitivity of the insulin signalling pathway has a
sensitized genetic background (compared to the
wildtype and/or the reference strain CB1370), as
defined above.

Even more preferably, the nematode worm with
increased sensitivity of the insulin signalling
pathway (e.g. a sensitized genetic background) has an
ISV which is greater than the ISV for wildtype and/or
CB1370, and even more preferably an ISV as defined
above.

Some preferred, but non-limiting, examples of
suitable C. elegans strains include DR1564:
daf-2(m41), CB1368: daf-2(e1368)
and some of the (other) strains mentioned in Gems et
al., supra. Other suitable strains will be clear to
the skilled person, based upon the disclosure herein.

The most preferred nematode strain is DR1564:
daf-2(m41).

The sample of nematodes may comprise any suitable
number of worms, depending on the size of the
container/vessel used. Usually, the sample will
comprise between 2 and 500, preferably between 3
and 300, more preferably between 5 and 200, even more
preferably between 10 and 100 nematodes. When the
assay is carried out in multi-well plate format, each
well usually contains between 15 and 75 worms, such as
20 to 50 worms. Although not preferred, it is not
excluded that a sample may consist of a single worm.

Usually, each such individual sample of worms
will consist of worms that - at least at the start of
the assay - are essentially the same, in that they are
of the same strain, in that they contain the same
mutation(s), in that they are essentially of an
isogenic genotype, in that they show essentially the
same phenotype(s), in that they are essentially
"synchronised" (i.e. at essentially the same stage of
development, such as L1 or dauer; it should however be
noted that this stage of development may - and usually
will - change during the course of the assay, and not
for all worms in the sample at the same rate and/or in
the same way), in that they have been grown/cultivated
in essentially the same way, and/or in that they have
been grown under and/or exposed to essentially the
same conditions, factors or compounds, including but
not limited to pheromones, gene suppression (such as
by RNAi), gene- or pathway-inducing factors or (small)
molecules, and/or gene- or pathway-inhibiting factors
or (small) molecules. However, in its broadest sense,
the invention is not limited thereto.

The medium may further contain all factors,
compounds and/or nutrients required to carry out the
assay and/or required for the survival, maintenance
and/or growth of the worms. For this, reference is
again made to the prior art, such as the applications
and handbooks referred to above. In one specific
embodiment, the medium may also contain a suitable
source of food for the worms - such as bacteria (for
example a suitable strain of E. coli) - in a suitable
amount.

In the method of the invention, the sample of
nematodes can be kept - e.g. maintained, grown or
incubated - in any suitable vessel or container, but
is preferably kept in a well of a multi-well plate,
such as standard 6, 24, 48, 96, 384, 1536, or 3072
well-plates (in which each well of the multi-well
plate may contain a separate sample of worms, which
may be the same or different). Such plates and general
techniques and apparatus for maintaining/handling
nematode worms in such multi-well plate format are
well known in the art, for instance from the
applications mentioned hereinabove.

The sample of nematodes may be kept in or on any
suitable medium - including but not limited to solid
and semi-solid media - but is preferably kept in a
suitable liquid or viscous medium (e.g. with a
viscosity at the temperature of the assay that is
equal to or greater than the viscosity of M9 medium, as
measured by a suitable technique, such as an
Ubbelohde, Ostwald and/or Brookfield viscosimeter).

Generally, suitable media for growing/maintaining
nematode worms will be clear to the skilled person,
and include for example the media generally used in
the art, such as M9, S-buffer, and/or the further
media described in the applications and handbooks
mentioned hereinabove.

Preferably, the assays of the invention are based
on the dauer phenotype as a biological read out, e.g.
the entry into, the bypass of and/or the rescue from
the dauer state, and/or any other property which
results from and/or is associated with the so-called
dauer decision.

For instance, an assay based upon entry
into/bypass of the dauer state may comprise the
following steps:

a) providing a sample of nematode worms (preferably
eggs, L1 or L2 worms, and most preferably L1
worms);

b) keeping said sample under conditions such that, without
the presence of any compound(s) to be tested, at
least 50%, preferably at least 60%, more
preferably at least 70%, even more preferably at
least 80%, such as 85-100% of the nematodes
present in said sample would enter the dauer state
(at least during the time used for the assay, such
as at least 1 day, for example 2-4 days - e.g.
about 72 hours - as further described below);

c) exposing the sample to the compound(s) to be
tested;

d) measuring the number of worms that enter the
dauer state, and/or measuring the number of worms
that grow into adults.

Preferably, in such an assay, the conditions used
in step b) are such that, in the presence of a
reference compound (such as a vanadate compound, e.g.
sodium orthovanadate) at a suitable concentration
(such as between 0.5 and 2 millimolar, which is
particularly suited for vanadate), the amount of worms
that enter the dauer state is at least 10% less (i.e.
lower in absolute difference of percentages as also
referred to above), preferably at least 20% less, more
preferably at least 30% less, than the amount of worms
that enter the dauer state without the presence of any
such reference compound (at least during the time used
for the assay, such as at least 1 day, for example 2-4
days - e.g. about 72 hours - as further described
below).

WO 01/93669    PCT/IB01/01199

For instance, the conditions used in step b) may
be such that, in the presence of a reference compound
(such as a vanadate compound, e.g. sodium
orthovanadate) at a suitable concentration (such as
between 0.5 and 2 millimolar, which is particularly
suited for vanadate), the amount of worms that enter
the dauer state is less than 50%, preferably less than
40%, even more preferably less than 30% (at least
during the time used for the assay, such as at least 1
day, for example 2-4 days - e.g. about 72 hours - as
further described below, and depending on the amount
of worms that would enter the dauer state without the
presence of the reference), although the invention in
its broadest sense is not limited thereto.
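
Purely as an illustrative sketch, the conditions stated above for step b) of the dauer entry/bypass assay may be checked as follows in Python. The function names and the worm counts are hypothetical; the thresholds (at least 50% dauers without compound, and at least 10 percentage points fewer dauers with the reference compound) are the broadest values given in the text.

```python
def percent_dauer(n_dauer: int, n_total: int) -> float:
    """Percentage of worms in a sample that are in the dauer state."""
    if n_total <= 0 or not 0 <= n_dauer <= n_total:
        raise ValueError("need 0 <= n_dauer <= n_total and n_total > 0")
    return 100.0 * n_dauer / n_total


def entry_conditions_ok(control_pct: float, reference_pct: float,
                        min_control_pct: float = 50.0,
                        min_drop_points: float = 10.0) -> bool:
    """Check step b): enough dauers without compound, and a sufficient
    absolute drop (in percentage points) with the reference compound."""
    enough_dauers = control_pct >= min_control_pct
    enough_drop = (control_pct - reference_pct) >= min_drop_points
    return enough_dauers and enough_drop


# Hypothetical counts: 45 of 50 dauers without compound,
# 20 of 50 dauers with a vanadate reference compound.
control = percent_dauer(45, 50)
reference = percent_dauer(20, 50)
print(control, reference, entry_conditions_ok(control, reference))
```

The default thresholds can be tightened (e.g. `min_control_pct=80.0`, `min_drop_points=30.0`) to reflect the more preferred values.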

An assay based upon rescue from the dauer state
may comprise the following steps:

a) providing a sample of nematode worms in the
dauer state;

b) keeping said sample under conditions such that,
without the presence of any compound to be
tested, at least 50%, preferably at least 60%,
more preferably at least 70%, even more
preferably at least 80%, such as 85-100% of
the nematodes present in said sample would
remain in the dauer state (at least for the time
of the assay, such as between 1 and 96 hours, such
as between 12 and 72 hours, such as about 24-48
hours);

c) exposing the sample to the compound(s) to be
tested;

d) measuring the number of worms that remain
in the dauer state, and/or measuring the number
of worms that go out of the dauer state (e.g.
become adults).

Preferably, in such an assay, the conditions used
in step b) are such that, in the presence of a
reference compound (such as a vanadate compound, e.g.
sodium orthovanadate) at a suitable concentration
(such as between 0.5 and 2 millimolar, which is
particularly suited for vanadate), the amount of worms
that remain in the dauer state is at least 10% less
(i.e. lower in absolute difference of percentages as
also referred to above), preferably at least 20% less,
more preferably at least 30% less, than the amount of
worms that remain in the dauer state without the
presence of any such reference compound (at least
during the time used for the assay, such as between 1
and 96 hours, such as between 12 and 72 hours, such as
about 24-48 hours).

For instance, the conditions used in step b) may
be such that, in the presence of a reference compound
(such as a vanadate compound, e.g.
sodium orthovanadate) at a suitable concentration
(such as between 0.5 and 2 millimolar, which is
particularly suited for vanadate), the amount of worms
that remain in the dauer state is less than 50%,
preferably less than 40%, even more preferably less
than 30% (at least during the time used for the assay,
such as between 1 and 96 hours, such as between 12 and
72 hours, such as about 24-48 hours, and depending on
the amount of worms that would remain in the dauer
state without the presence of the reference), although
the invention in its broadest sense is not limited
thereto.

Techniques for distinguishing, in a sample, and
preferably in an automated and/or multi-well plate
format, the number of adults and/or the number of
dauers will be clear to the skilled person and may
include visual/manual techniques, and/or the
non-visual detection techniques described in the
applications referred to above.

In the assays of the invention, each individual
sample of nematode worms will generally be exposed to
a single compound to be tested, at a single
concentration, with different samples (e.g. as present
in the different wells of the multi-well plate used)
being exposed either to different concentrations of
the same compound (e.g. to establish a dose response
curve for said compound), to one or more different
compounds (which may for instance be part of a
chemical library and/or of a chemical class or series,
such as a series of closely related structural
analogues), or both (e.g. to the same and/or different
compounds at different concentrations).
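
As a non-limiting illustration of such a plate layout, the following Python sketch assigns one compound at one concentration to each well, with each row holding a dilution series of a single compound. The two-fold dilution scheme, the 96-well row/column naming, the compound names and the function names are assumptions made for this example only.

```python
from typing import Dict, List, Tuple


def dilution_series(top_conc_um: float, factor: float,
                    n_points: int) -> List[float]:
    """Concentrations (micromolar) obtained by repeatedly dividing the
    top concentration by a fixed dilution factor."""
    if top_conc_um <= 0 or factor <= 1 or n_points < 1:
        raise ValueError("need top_conc_um > 0, factor > 1, n_points >= 1")
    return [top_conc_um / factor ** i for i in range(n_points)]


def layout_plate(compounds: List[str],
                 concentrations_um: List[float]) -> Dict[str, Tuple[str, float]]:
    """Map well names of a 96-well plate to (compound, concentration):
    one compound per row (A-H), one concentration per column (1-12)."""
    rows = "ABCDEFGH"
    if len(compounds) > len(rows) or len(concentrations_um) > 12:
        raise ValueError("layout does not fit on a 96-well plate")
    layout = {}
    for row, compound in zip(rows, compounds):
        for col, conc in enumerate(concentrations_um, start=1):
            layout[f"{row}{col}"] = (compound, conc)
    return layout


# Hypothetical example: two compounds, four concentrations each.
series = dilution_series(200.0, 2.0, 4)
plate = layout_plate(["compound_1", "compound_2"], series)
print(plate["A1"], plate["B4"])
```

Each well thus receives a single compound at a single concentration, while the row as a whole provides the points of a dose response curve.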

It is also within the scope of the invention to
expose the (sample of) nematodes to two or more
compounds - at essentially the same time or
sequentially (e.g. with an intermediate washing step)
- for example to determine whether the two compounds
together have an effect which is the same as or different from
that of the compounds separately (e.g. a
synergistic effect or an inhibitory or competitive
effect).

Furthermore, it is within the scope of the
invention to use one or more reference samples, e.g.
samples without any compound(s) present, and/or with a
predetermined amount of a reference compound. The
invention also includes the use, in an assay, of two
or more samples of nematode worms of different
strains, e.g. to compare (the effect of the
compound(s) to be tested on) the different strains, in
which said different strains may also be reference
strains, such as wildtype, N2 or Hawaiian.

In a preferred embodiment, an assay based on
dauer entry/bypass is carried out in a multiwell plate
format, under the following conditions:

- use of a sample of between 2 and 100, preferably
  between 10 and 80, more preferably between 15 and
  60 worms, such as 20 or 50 worms, preferably eggs,
  L1 or L2, most preferably L1;

- a temperature of between 10°C and 30°C,
  preferably between 20°C and 27°C, such as 21, 22,
  23, 24, 25 or 26°C, depending on the specific
  strain used.
  For example, for DR1564: daf-2(m41), usually a
  temperature of about 21, 22, 23 or 24°C will be
  preferred, with a temperature of between 21 and
  22°C being particularly preferred.
  For CB1368: daf-2(e1368), usually a temperature of
  24, 25 or 26°C will be preferred, with 25°C being
  particularly preferred;

- a concentration of the compound(s) to be tested
  of between 0.1 nanomolar and 100 millimolar,
  preferably between 1 nanomolar and 10 millimolar,
  more preferably between 1 micromolar and 200
  micromolar, such as about 20 micromolar. The
  compound may be taken up by the nematodes in any
  suitable manner, such as by drinking, soaking, via
  the gastrointestinal tract (e.g. the gut), via the
  cuticle (e.g. by diffusion or an active transport
  mechanism), and/or via openings in the cuticle,
  such as amphid sensory neurons. Generally, the
  compound will be mixed with or otherwise
  incorporated into the medium used;

- a time of contact with the compound(s) to be
  tested of between 0.1 minute and 100 hours,
  preferably between 1 minute and 90 hours, such as
  about 1 hour to 72 hours. For instance, the sample
  of nematodes may be contacted with the compound(s)
  to be tested for only a brief period of time, e.g.
  between 1 minute and 2 hours, such as between 20
  minutes and 1.5 hours, upon which the sample of
  nematodes may be washed and further cultivated on
  fresh medium (i.e. without compound), or the sample
  of nematodes may be contacted with the compound(s)
  to be tested for essentially the entire duration of
  the assay (e.g. for 1-3 days or more). For assays
  involving (the bypass of) dauer formation (e.g.
  starting from L1), the time of contact will
  generally encompass two or more stages of
  development, and most preferably be between 1 and 4
  days, such as about 2-3 days (e.g. 48 to 72 hours);

- a (total) time of incubation of the sample of
  between 0.1 minute and 100 hours, preferably
  between 1 minute and 90 hours, such as about 1 hour
  to 72 hours. For assays involving dauer
  entry/bypass (e.g. starting from L1), the total
  incubation time will generally encompass two or
  more stages of development, and most preferably be
  between 1 and 4 days, such as about 2-3 days (e.g.
  48 to 72 hours);

- the use of a liquid or viscous medium (in which
  viscous is as defined above), such as S-buffer,
  M9 or one of the other media referred to in the
  patent applications mentioned above, with S-buffer
  being particularly preferred;

- the presence of a suitable source of food - for
  example bacteria such as E. coli - in a suitable
  amount, e.g. between 0.001 and 10% (w/v),
  preferably between 0.01 and 1%, more preferably
  between 0.1 and 0.2%, such as about 0.125% w/v,
  based on the total medium.
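
The conditions listed above for the dauer entry/bypass assay may, purely by way of example, be captured and checked against the broadest stated ranges as in the following Python sketch. The class, field and function names are hypothetical; the numerical ranges and the example values (20 worms, 21.5°C, 20 micromolar, 72 hours, 0.125% w/v food) are taken from the list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EntryAssayConditions:
    worms_per_well: int
    temperature_c: float
    compound_conc_molar: float
    contact_time_h: float
    food_percent_wv: float


# Broadest ranges stated in the text, as (low, high).
BROAD_RANGES = {
    "worms_per_well": (2, 100),
    "temperature_c": (10.0, 30.0),
    "compound_conc_molar": (0.1e-9, 100e-3),   # 0.1 nM to 100 mM
    "contact_time_h": (0.1 / 60, 100.0),       # 0.1 minute to 100 hours
    "food_percent_wv": (0.001, 10.0),
}


def out_of_range(conditions: EntryAssayConditions) -> List[str]:
    """Names of all parameters lying outside the broadest ranges."""
    problems = []
    for name, (low, high) in BROAD_RANGES.items():
        if not low <= getattr(conditions, name) <= high:
            problems.append(name)
    return problems


# Example values within the preferred ranges, e.g. for DR1564: daf-2(m41).
example = EntryAssayConditions(
    worms_per_well=20,
    temperature_c=21.5,
    compound_conc_molar=20e-6,
    contact_time_h=72.0,
    food_percent_wv=0.125,
)
print(out_of_range(example))
```

Narrower dictionaries of ranges could be substituted for `BROAD_RANGES` to test against the preferred or more preferred values.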

Conditions for assays based on dauer rescue are
further described below and/or in PCT US 98/10800 and
US-A-6,225,120.

Although the conditions described above are
particularly preferred, more generally, according to
the invention, the nematode strains with increased
sensitivity of the insulin signalling pathway (as
further defined above) may be used with advantage in
any C. elegans-based assay technique involving and/or
relating to insulin-signalling, insulin signal
transduction, biological responses to insulin and/or
insulin-type compounds, and/or the insulin pathway.
These assays may be based on any suitable phenotypical
read out, including but not limited to dauer entry,
bypass and/or rescue as described above.

Therefore, in accordance with one aspect of the
invention, there is provided a method for the
identification of a compound which is capable of
modulating insulin signalling pathways, which method
comprises:

providing C. elegans larvae of a strain of
sensitized genetic background to the insulin
signalling pathway;

contacting said larvae with a test compound in
growth favouring conditions, i.e. including food; and

screening for growth to adulthood, i.e. bypass of
or release from the dauer larval state.

A "sensitized genetic background" may be defined
herein by comparison to the reference daf-2 allele,
e1370 (Figure 2 is a print of the acedb database entry
on daf-2). The term "sensitized genetic background"
encompasses C. elegans strains which exhibit greater
sensitivity to test compounds than the daf-2(e1370)
allele.

The method of the invention is suitable for use
with essentially any C. elegans strain which exhibits
a dauer phenotype as a result of a defect, for example a
mutation, in a gene encoding a component of the
insulin signalling pathway, or of another intervention
affecting the insulin signalling pathway, and which
exhibits a "sensitized genetic background" as compared
to the daf-2(e1370) mutant.

In a preferred embodiment the method of the
invention may be carried out using C. elegans strain
DR1564 containing the daf-2(m41) mutation, which
exhibits a dauer-constitutive phenotype. Use of
strains carrying this allele in compound screens based
on bypass of/rescue from dauer is illustrated in the
accompanying Examples. Table 6 compares the activity
of 94 compounds, which were found to be positive in a
primary screen of 8,000 compounds using DR1564:
daf-2(m41), as part of Example 1, in a retest on the
m41 allele bearing strain DR1564 and on the daf-2
allele bearing strains CB1368: daf-2(e1368) and
CB1370: daf-2(e1370). DR1564: daf-2(m41) was found to be more
sensitive to compound activities than CB1368:
daf-2(e1368), with 56% and 27% confirmation rate,
respectively. The strain CB1370 containing the daf-2
reference allele e1370 could not be rescued by any of
the 94 compounds.

Other sensitized backgrounds in addition to
daf-2(m41) may be used in accordance with the invention.
Since both m41 and e1368 belong to class I alleles in
the classification of Gems et al. 1998, Genetics 150:
129-155, while e1370 belongs to class II, it is likely
that other class I alleles are also useful as
sensitized genetic backgrounds. Typically class I
alleles are mutations in the ligand binding domain,
and class II mutations are located in the kinase
domain. The precise molecular lesion of m41 is
unknown.

Other C. elegans strains with sensitized genetic
backgrounds which may be used in accordance with the
invention include strains exhibiting a dauer phenotype
which comprise loss of function or reduction of
function mutations in genes downstream of the insulin
receptor (daf-2). A particular example is the age-1
mutation, a mutation in the catalytic subunit of the
PI3-kinase (see Figure 1 and Table 1). While gain of
function alleles of akt-1 or pdk-1 are not able to
rescue daf-2(e1370), they do rescue age-1 mutations
(Paradis and Ruvkun 1998, Genes & Dev 12:2488-2489,
Paradis and Ruvkun 1999, Genes & Dev 13:1438-1452).

While there are no mutations known in the
regulatory subunit of the PI3-kinase (located on the
YAC clones Y119C1 and Y110A7), knock-out mutations in
these genes may be generated by methods known in the
art (Zwaal et al. 1993, PNAS 90: 7431-35; Liu et al.
1999, Genome Research 9:859-867). Other suitable
strains carry loss of function mutations in the genes
encoding AKT protein kinases. Since there are two
redundantly acting AKT protein kinases (Paradis and
Ruvkun 1998, Genes & Dev 12:2488-2489), a double
mutant carrying knock-outs of both akt-1 and akt-2 may
be constructed by simple crossing. Another
potentially useful mutation is the loss of function
mutation in pdk-1(sa680), as described in Paradis and
Ruvkun 1999, above cit.

In a further embodiment of the method of the
invention, a C. elegans strain having a sensitized
genetic background may be obtained by inhibiting
proteins of the insulin-receptor pathway using
specific inhibitor compounds. In particular,
inhibitors of the PI3-kinase are known, such as
Wortmannin and LY294002. Babar et al. 1999, Neurobiol
Aging 20:513-519 demonstrate the activity of LY294002
in inducing dauer formation. The inventors' own
experiments also illustrate the activity of Wortmannin
(Figure 4).

RNAi inhibition is still another method of
generating C. elegans strains with loss of function
phenotypes suitable for use in the method of the
invention. Methods of inhibiting expression of
specific genes in C. elegans using RNAi are well known
in the art and described, for example, by Fire et al.,
Nature 391:801-811 (1998); Timmons and Fire, Nature
395:854 (1998) and Plaetinck et al., WO 00/01846.
Most preferred are the techniques described in WO
00/01846 which use special bacterial strains as food
source to obtain double stranded RNA inhibition.

In yet another embodiment of the present
invention, sensitized strains may be used which
comprise gain of function mutations of daf-18 or
daf-16 or of the C. elegans homologs of PTP-1B or
SHIP2. Generation of gain of function mutations of
serine or threonine phosphorylation sites, as
disclosed for daf-16 by Paradis and Ruvkun 1998, above
cit., and by Kops et al. 1999, Nature 398: 630-634, is
straightforward for researchers experienced in the
state of the art, as demonstrated by Nakae et al.
2000, EMBO 19: 989-996 for FKHR, a human homologue of
daf-16.

Yet another sensitized genetic background may be
derived by using mutants defective in perception of
environmental signals that regulate insulin
signalling, such as pheromone, food and temperature
signals, or mutations in the neural processing of said
signals, or mutations in the secretion of insulin-like
molecules or in one of the genes encoding an
insulin-like molecule. In a preferred embodiment
tph-1(mg280) is used, a mutant deficient in tryptophan
hydroxylase, which is necessary for serotonin biosynthesis. C.
elegans worms with this mutation accumulate large
stores of fat and to some extent form dauer larvae
because of an inability to process the food sensation,
together with impaired temperature sensation (Sze et
al. 2000, Nature 403: 560-564). Other suitable
sensitized genetic backgrounds comprise daf-c
mutations in daf-1, daf-4, daf-7, daf-8, daf-11,
daf-14, daf-21, daf-19 or daf-28. Furthermore,
dominant activation mutations in neuronal G proteins,
as described by Zwaal et al. 1997, Genetics 145:
715-727, may also serve as sensitized background.

Several synthetic dauer forming mutations are
known, which enhance other genetic backgrounds to form
dauers. One specific example, the double mutant
unc-64(e246); unc-31(e928), is given by Ailion et al.
1999, PNAS 96, 7394-7397. Since unc-64 encodes a
homolog of syntaxin, a protein involved in synaptic
transmission and other types of Ca2+-regulated
secretion, and unc-31 encodes a homolog of CAPS
(Ca2+-dependent activator protein for secretion), and
since insulin release in pancreatic β cells is determined by
Ca2+-regulated secretion, the simplest model is that
the Daf-c phenotype of the double mutation is caused
by a shut down of release of either insulin-like
molecules themselves or of neurotransmitters that
stimulate insulin release (Ailion et al. 1999, PNAS
96, 7394-7397).

Sensitized worm strains which comprise any
combination of two or more synthetic dauer formation
mutations amongst each other, or in combination with
dauer constitutive mutations (examples of which are provided
above), or any combination of dauer constitutive
mutations with each other, may be used in the method of
the invention. An example can be drawn from Ogg et
al. 1997, Nature 389: 994-999, where a daf-2; daf-1
double mutant induces dauer formation at temperatures
far below the temperatures necessary for each of the
single mutations to induce dauer formation.

The disclosed screening method is based on
bypass of/release from the dauer larval state. There
are several different ways in which to screen for
bypass of/release from the dauer state which may be
used in accordance with the invention, as described
below. Furthermore, it is possible to use phenotypes
of Daf genes other than dauer, including but not limited
to fat storage, regulation of metabolic enzymes or
stress resistance pathways, or any other biochemically,
transcriptionally or posttranscriptionally regulated
effect that is measurable, as the basis of an assay
read-out in accordance with the invention.

In accordance with a second aspect the invention
also provides a method for the identification of a
compound which is capable of modulating insulin
signalling pathways, which method comprises:

providing C. elegans larvae of a strain of
sensitized genetic background to the insulin
signalling pathway;

contacting said larvae with a test compound in
growth favouring conditions, i.e. including food; and

screening for growth to adulthood, i.e. bypass of
or release from the dauer larval state, wherein
conditions of assay are selected such that a basal
level of bypass of or release from the dauer larval
state is observed in the absence of the test compound.

The second aspect of the present invention
comprises a sensitized assay condition, in contrast
to the tight screening conditions usually used in
screens to isolate genetic suppressors of daf-2, e.g.
daf-16 alleles (Riddle et al. 1981, Nature
290:668-671; Gottlieb & Ruvkun 1994, Genetics 137:
107-120).

The inventors provide a method of setting the
assay conditions in such a way that a basal level of release
from the dauer larval state is already present in
controls. The basal level of release from the dauer
larval state may for example be measured by counting
the number of worms growing beyond the dauer stage in
a sufficiently large number of control wells
(containing the solvent alone but no test compounds).

The basal level of release from the dauer larval
state will preferably be between 0.1% and 60% rescue,
more preferably between 1% and 50% rescue and most
preferably between 2% and 40% rescue, such as 10% to
20% rescue. While the minimal number of growing worms
or residual activity is derived from sensitizing the
assay conditions, the maximal number is derived from
experience, to optimise the signal to noise ratio.

Although in a preferred embodiment the method of
the invention uses the temperature sensitivity of
daf-2 mutations, such as m41, to sensitize assay
conditions, any set of conditions that sensitize the
assay over the strict genetic screen conditions is
within the scope of the invention, in particular
conditions that show growth between 0.1% and 60%,
preferably between 1% and 50%, most preferably
between 2% and 40%, such as 10% to 20%, in cases where
the readout of the assay is related to bypass of or
release from the dauer-constitutive phenotype.
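
As an illustrative sketch only, the basal level of release from the dauer larval state may be estimated from control wells and compared with the ranges given above as follows. The function names, the window labels and the per-well counts are hypothetical; pooling the counts over all control wells is one possible way of combining them and is an assumption of this example.

```python
from typing import List, Tuple


def basal_release_percent(control_wells: List[Tuple[int, int]]) -> float:
    """Pooled percentage of worms growing beyond the dauer stage.
    Each entry is (worms grown beyond dauer, total worms) for one
    control well containing solvent but no test compound."""
    grown = sum(g for g, _ in control_wells)
    total = sum(t for _, t in control_wells)
    if total <= 0:
        raise ValueError("control wells contain no worms")
    return 100.0 * grown / total


def classify_basal_level(percent: float) -> str:
    """Narrowest window from the text that contains the basal level."""
    windows = [
        ("10-20% (example window)", 10.0, 20.0),
        ("2-40% (most preferred)", 2.0, 40.0),
        ("1-50% (more preferred)", 1.0, 50.0),
        ("0.1-60% (preferred)", 0.1, 60.0),
    ]
    for label, low, high in windows:
        if low <= percent <= high:
            return label
    return "outside preferred range"


# Hypothetical control wells: (grown beyond dauer, total worms).
wells = [(3, 20), (2, 20), (4, 20), (3, 20)]
basal = basal_release_percent(wells)
print(basal, classify_basal_level(basal))
```

If the basal level falls outside the desired window, the assay conditions (for example the temperature, for temperature-sensitive daf-2 alleles) may be adjusted and the control wells measured again.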

Another embodiment of the invention uses genetic
means to sensitize assay conditions to the desired
basal level of release from the dauer larval state.
For example Ogg & Ruvkun (1998), Mol. Cell 2: 887-893,
disclose a double mutation daf-2; daf-18, which gives
rescue (L4 and adults) at a level of 2.2%. In
addition, mutations known as Daf-d (for dauer
defective), especially weak mutations, can be used in
the present invention. Also gain of function
mutations, such as the known pdk-1(mg142) (Paradis
and Ruvkun 1999, Genes & Dev 13:1438-1452) and
akt-1(mg144) (Paradis and Ruvkun 1998, Genes & Dev
12:2488-2489), can be used to rescue from dauer
formation to a certain percentage. Furthermore, gain
of function mutations, in particular at phosphorylation
sites, or loss of function mutations can be generated by
methods known in the art (see citations in the section
further above).

Also suitable for use in the method of the
invention are C. elegans strains which comprise a
mutation in a gene downstream of the insulin receptor
in the insulin signalling pathway which leads to a
reduction in the function of the product of the
mutated gene but not a complete loss of function.
Residual activity of the product encoded by the gene
mutated in such strains may be sufficient to confer a
basal level of release from the dauer larval state.

Another embodiment of the invention comprises the
incomplete loss of function typically seen in RNAi
experiments. Since the disclosed methods rely on
growth of worms in the presence of E. coli, methods of
obtaining RNA inhibition via feeding of appropriately
engineered bacterial strains may be used as described
in Plaetinck et al., WO 00/01846.

Still another embodiment of the invention
comprises the incomplete rescue typically obtained with
heterologous transgenes. For example, a strain
daf-16; daf-2; Ex[daf-16b::hsFKHR] has been
constructed in which daf-16 loss of function, in
itself rescuing from daf-2 induced dauer formation, is
rescued by the human homolog FKHR under the C. elegans
daf-16b promoter. This rescue is incomplete, to about
60% dauer formation, so that 40% grow to adulthood
(Gary Ruvkun, personal communication). Any other
homologue of daf-16, for example the human genes
FKHRL1 or AFX, or others, mammalian or human, could be
used in combination with suitable promoters, either one
of the endogenous daf-16 promoters, daf-16a or daf-16b
or both, or a heterologous promoter, preferably with
ubiquitous expression or nervous system expression.

Still another embodiment of the invention is
based on the addition of pheromone preparations so
that the fraction of worms growing to adults is driven
below 60%, preferably below 40%, more preferably below
40%, such as between 10% and 20%. As already
mentioned, Sze and co-workers (Nature 403: 560-564)
generated a tph-1(mg280) mutation, which induces dauer
arrest at 15%, mimicking low food supply and with some
resistance to temperature control. However, since the
dauer arrest can be enhanced to 80% using a daf-7
mutation, which is defective in production of a
TGFβ-like molecule signalling the absence of pheromone,
addition of pheromone could achieve the desired level
of 80% dauer formation as an alternative to the double
mutant. Pheromone preparations may be obtained by
the method of Golden & Riddle 1984, PNAS 81: 819-823.

This screening method of the invention is again
based on bypass of/release from the dauer larval state
and there are several different ways of screening for
bypass of/release from dauer which may be used in
accordance with the invention, see below. The
invention can as well be based on any other phenotype
relating to the insulin pathway, such as are observed
in daf-2 mutations, including but not limited to fat
storage, regulation of metabolic enzymes or stress
resistance pathways or any other biochemically,
transcriptionally or posttranscriptionally regulated
effect that is measurable.

Set out below are ways of screening for bypass of
or release from the dauer larval state which may be
used in accordance with the invention.

One of the simplest and most exact methods of
measuring bypass of/rescue from dauer larvae formation
is counting of adults. Counting of adults may be
achieved using automated means, e.g. automatic plate
readers, allowing the screen to be performed in
mid-to-high throughput format in multiwell microtiter
plates.

A further method of screening for bypass of or
rescue from the dauer phenotype exemplified herein is
based on staining of adults using Nile Red and
automated data acquisition (Example 2). Other methods
of screening for release from the dauer larval state
are also encompassed by the invention.

As an alternative to direct counting of adults,
indirect measurements, for example of the consumption of
food by measuring turbidity, may form a usable
readout.

Further methods of screening for bypass
of/release from the dauer larval state are based on
the use of a reporter transgene. Suitable reporter
transgene constructs generally comprise a promoter or
promoter fragment operably linked to a reporter gene.
The promoter or promoter fragment is one which is
capable of directing strong gene expression in adult
C. elegans but no or weak gene expression in dauer
larvae, such as a promoter which is regulated by the
daf-2 signalling pathway (e.g. promoters regulated by
the transcription factor daf-16), or vice versa (i.e.
no or weak expression in adults, strong expression in
dauer larvae). The term "operably linked" refers to a
juxtaposition in which both components function in
their intended manner, i.e. the promoter drives
expression of the reporter gene. One example of a
suitable transgene is a construct comprising the C.
elegans vit-2 promoter operably linked to a luciferase
reporter gene. Any other promoter that shows strong
expression in adults but no or weak expression in
dauer larvae may be used as an alternative to the
vit-2 promoter. Other reporter genes may be used as
alternatives to luciferase. Preferably the reporter
gene will be one encoding a product which is directly
or indirectly detectable in the worm, for example a
fluorescent, luminescent or coloured product, e.g. GFP
or lacZ. Preferably expression of the reporter gene
product in the worm will be measurable using an
automated plate reader.

The inventors provide methods for constructing
ctl-1::luciferase and sod-3::luciferase reporter
transgenes, the ctl-2 and sod-3 genes encoding
respectively a cytosolic catalase with markedly increased
expression in daf-2 dauer larvae (Taub et al. 1999,
Nature 399:162-166) and a manganese superoxide
dismutase strongly up-regulated in daf-2 mutant adults
(Honda and Honda 1999, FASEB 13: 1385-1393). The
regulation of a mitochondrial manganese superoxide
dismutase by daf-2 is of particular interest, since it
has recently been shown that overexpression of a
Mn-SOD in vascular endothelial cells can suppress
several pathways involved in hyperglycaemic damage,
indicating that this damage is caused by production
of superoxides (Nishikawa et al. 2000, Nature 404:
787-790).

To perform a screen using a reporter transgene
the transgene must first be introduced into the C.
elegans used in the screen. This may be achieved
using standard techniques for the construction of
transgenic C. elegans well known in the art and
described, for example, in Methods in Cell Biology,
Vol 48, Ed. H.F. Epstein and D.C. Shakes, Academic
Press.
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Table 1: targets of the insulin receptor pathway 



Targets      | Human homologs | Function                       | Validation                                                         | Desired intervention
-------------+----------------+--------------------------------+--------------------------------------------------------------------+---------------------
DAF-2        | IR             | Receptor tyrosine kinase       | e1391 equals het. mutation of a morbidly obese diabetic patient    | +
             | PTP-1B         | Protein tyrosine phosphatase   | Mouse k.o. insulin hypersensitive                                  | B
DAF-2 !      | IRS-1, -2      | Insulin receptor substrate     | IR/+; IRS-1/+ age onset diabetes, IRS2 diabetic                    | +
AGE-1        | p110           | PI3-kinase catalytic subunit   | p110β insulin responsive                                           | +
             | p85/p55        | PI3-kinase regulatory subunit  | p85α k.o. insulin hypersensitive                                   | +/B
DAF-18       | PTEN           | PI-3' phosphatase              | maternal and zygotic minus rescues daf-2(e1370)                    | B
             | SHIP2          | PI-5' phosphatase              | Overexpression inhibits AKT activation                             | B
PDK-1        | PDK1           | AKT phosphorylation            | gf rescues dauers, lf induces dauers                               | +
AKT-1, AKT-2 | AKT = PKB      | Forkhead TF phosphorylation    | gf rescues, double RNAi induce dauers                              | +
DAF-16       | FKHR, FKHRL1   | Transcription factor           | lf rescues daf-2(e1370)                                            | B
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The present invention will be further understood 
with reference to the following Experimental examples,
together with the accompanying Figures in which:

Figure 1 illustrates the insulin receptor signalling
pathway of C. elegans.

Figure 2 is a print of the acedb database entry on
daf-2.

Figure 3 is a graph to show that vanadates can rescue
the genetic insulin resistance caused by
daf-2 mutations in C. elegans in an assay
based on bypass of/rescue from the dauer
larval state.

Figure 4 is a graph to show that wortmannin further
enhances insulin resistance caused by daf-2
mutations in C. elegans in an assay based on
bypass of/rescue from the dauer larval
state.

Figure 5 is a scatter plot of mean and variance of
controls for the screening experiment
described in Example 1: (a) screening, (b)
DRC.

Figure 6 shows the distribution of controls and a maximum
likelihood fit of a negative binomial
distribution for data generated in the
screening experiment described in Example 1.

Figure 7 shows the distribution of controls in % of the
average of the plate for data generated in
the screening experiment described in
Example 1.

Figure 8 shows the results of a representative Nile
Red staining experiment (Example 2).

Figure 9 is a representation of pGQ1.

Figure 10 is a representation of pDW2020.

Figure 11 shows the complete nucleotide sequence of
pDW2020.

Figure 12 shows the complete nucleotide sequence of
pGQ1.

Figure 13 is a print of the acedb database entry on
ctl-2.

Figure 14 is a representation of pGQ2.

Figure 15 is a representation of pCluc6.

Figure 16 shows the complete nucleotide sequence of
pCluc6.

Figure 17 shows the complete nucleotide sequence of
pGQ2.

Figure 18 is a print of the acedb database entry on
sod-3.

Figure 19 is a representation of pGQ3.

Figure 20 shows the complete nucleotide sequence of
pGQ3.

Figure 21 is a representation of pGQ4.

Figure 22 shows the complete nucleotide sequence of
pGQ4.

Figure 23 illustrates the cloning of pCluc6.

Example 1: screening 23,040 compounds for activity in 
the insulin-receptor pathway. 

Materials used

• 9 cm plates seeded with OP50
• three weeks old stock plates of daf-2(m41)
• M9 buffer
• S-complete buffer
• 96-well plates flat bottom NUCLON Surface
• 96-well plates U-bottom for dilution of compounds
• HB101 bacteria (routinely available)
• compounds (80 per 96-well plate), 10 mM concentration
  in 100% DMSO
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Method 

Test of the batch of bacteria to be used as food:

• Growth of HB101
  - fill a sterile 2 liter Erlenmeyer flask with 0.5 l DYT
    medium
  - inoculate with a single colony of E. coli HB101
  - shake for 24 hours at 250 rpm and 37°C
  - centrifuge in sterile 250 ml centrifuge tubes for 10
    min at 10000 rpm
  - resuspend in 120 ml S-basal medium (pipette up
    and down and shake)
  - transfer to 8 15 ml falcon tubes that were weighed
    in advance
  - centrifuge a second time for 10 min at 6000 rpm
  - weigh the pellet
  - store at 4°C

• Test of the batch:
  - chunk a couple of plates of m41
  - bleach plates after 4 days, let eggs hatch on
    unseeded plate at 15°C
  - wash off L1s after one night
  - bring 50 L1 in 80 µl S-complete in one 96-well
    plate
  - add 10 µl 2% DMSO
  - add 10 µl of 1.25% of the batch of bacteria to be
    tested
  - put plate in closed box in the 21°C incubator
  - check the number of dauers after three days of
    growth; it should be no more than 10
  - if the batch is approved, it can be stored
    undiluted at 4°C for several weeks
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Protocol 

Thursday:
  - chunk 9 cm plates (take 1 plate/96-well plate to
    be filled)
  - grow in middle incubator at 15°C (preferably same
    shelf)

Monday: bleach plates
  - wash off in M9
  - 10 plates/15 ml falcon
  - put washed off plates back in 15°C incubator
    (only uncontaminated ones)
  - spin down at 1300 rpm/3 min
  - suck off M9
  - add bleach
  - when most worms are broken, add sucrose, shake,
    add 2 ml M9
  - spin at 1300 rpm/3 min
  - carefully remove eggs from bottom of layer of M9,
    bring in new falcon
  - add M9 to 15 ml
  - spin down 1300 rpm/3 min
  - add M9
  - spin down 1300 rpm/3 min
  - suck away M9 to 1 ml
  - divide eggs from one falcon over 3 unseeded
    plates
  - put plates at 15°C to let eggs hatch
Tuesday:

a) preparation of the compound plates
  - dilute aliquot of compound in 96-well plate to
    200 µM in S-buffer (DMSO conc. 2%)
  - replicate plates: four plates with 10 µl of 200 µM
    compound per well
  - write number and replicate number on plates
  - if there was no DMSO in columns 1 and 12 of the
    aliquoted plate it has to be added (add 11 µl of
    2% DMSO)
  - write number of the plate and the replicate on
    the lid of the plates

b) preparation of the worm solution

1) "bleached L1s"
  - wash L1 off plates in S-complete, 4 plates/15 ml
    falcon
  - spin down at 1300 rpm/3 min
  - add fresh S-complete to 100 ml
  - count worms in 10 µl
  - keep worm suspension at 15°C while counting
  - dilute further to approximately 50 worms/80 µl,
    count again
  - mix well

2) "washed L1s"
  - wash off plates that were washed yesterday
  - spin down (1300 rpm/3 min), add S-complete, wash,
    twice
  - filter suspension over 11 micron mesh over
    embroidery hoop into lid of 9 cm plate
  - wash L1s one more time
  - dilute to 50 worms/80 µl in the same way as
    bleached L1
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c) Final steps: 

  - add 1.25% freshly diluted HB101 bacteria to worm
    suspension so that final concentration is 0.125%
    (1 volume of bacteria to 8 of worms)
  - add 90 µl of worm-bacteria suspension/well with
    electronic pipette
  - put plates in closed boxes with wet tissues in
    21°C incubator
  - monitor temperature in control box in incubator
    while growing (try to put boxes on the same
    shelf, avoid contact of the boxes with the metal of
    the cooling device!)
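
The final composition of an assay well follows from the volumes
given in the protocol: 10 µl of 200 µM compound in 2% DMSO plus
90 µl of worm-bacteria suspension, of which one part in nine is
the 1.25% bacterial stock. The Python sketch below merely checks
this arithmetic; the helper name final_concentration is an
illustrative assumption.

```python
def final_concentration(stock, volume_added_ul, final_volume_ul):
    """Concentration after diluting volume_added_ul of stock into final_volume_ul."""
    return stock * volume_added_ul / final_volume_ul


WELL_VOLUME_UL = 100.0     # 10 µl compound plate + 90 µl worm-bacteria suspension
SUSPENSION_UL = 90.0

# Compound and DMSO come from the 10 µl already present in the well.
compound_uM = final_concentration(200.0, 10.0, WELL_VOLUME_UL)
dmso_percent = final_concentration(2.0, 10.0, WELL_VOLUME_UL)

# 1 volume of 1.25% bacteria to 8 volumes of worms: 1/9 of the suspension.
bacteria_ul = SUSPENSION_UL * 1 / 9
bacteria_percent = final_concentration(1.25, bacteria_ul, WELL_VOLUME_UL)

print(compound_uM, dmso_percent, bacteria_percent)
```

These values agree with the 20 µM screening concentration, the
0.2% DMSO background and the 0.125% final bacterial concentration
stated in this Example.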

Friday: Scoring:

1. count 8 negative control wells/plate

2. plot the average and variance of the negative
   controls from each plate

3. check for differences between boxes, differently
   treated L1s and replicates

4. if necessary define several groups, remove
   outliers

5. make a distribution of the negative controls per
   group (plot the number of wells against the number
   of worms/well)

6. for each defined group: fit a negative binomial
   distribution to the negative controls and
   determine the number of adults for a cut-off
   confidence level of about 1% and about 0.1% (both
   sides, for screens of dauer rescue and dauer
   enhancers)



WO 01/93669 



PCT/IB01/01199




7. screening for dauer rescue is possible if the average
   of the negative controls is between 0 and 15
   adults/well, screening for dauer enhancers is
   possible if the average is above 5

8. screen through the plates and count the wells
   with a high number of adults

9. if the number of adults in the well is below the
   cut-off value leave it

10. if the number of adults is above or at the 1%
    cut-off value circle the well as positive (for
    each of the replicates with a different color) and
    write the number in the circle

11. if the number of adults is above the 0.1% cut-off
    value estimate the number of adults

12. put the lids of the 4 replicates of the same
    plate on top of each other

13. search for wells with 2 or more positives in the
    4 (or 3) replicates

14. write down the number of adults of each of
    the 4 (or 3) replicates
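
The replicate rule in steps 10 to 13 can be sketched in Python as
follows. The function name call_well is a hypothetical helper; the
"strong positive" label follows the definition given under Data
acquisition in this Example, and the cut-off values 20 and 24 are
those quoted there for the Figure 6 example.

```python
def call_well(replicate_counts, cutoff_1pct, cutoff_01pct, min_replicates=2):
    """Classify one well position from the adult counts of its replicates."""
    at_1pct = sum(count >= cutoff_1pct for count in replicate_counts)
    at_01pct = sum(count >= cutoff_01pct for count in replicate_counts)
    if at_01pct >= min_replicates:
        return "strong positive"
    if at_1pct >= min_replicates:
        return "positive"
    return "negative"


# Adult counts for the same well position on 4 replicate plates (made-up values).
print(call_well([21, 3, 20, 5], cutoff_1pct=20, cutoff_01pct=24))
print(call_well([25, 30, 2, 1], cutoff_1pct=20, cutoff_01pct=24))
print(call_well([25, 3, 2, 1], cutoff_1pct=20, cutoff_01pct=24))
```

The same function applies unchanged when only 3 replicates are
available.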

Robustness 

While the controls active in the pathway show the
sensitivity of the assay (see Figures 2 and 3), its
specificity is determined by testing a range of
compounds outside the pathway. Together with the
reference compounds acting in the insulin signalling
pathway, of which only Wortmannin and vanadates were
active, anti-diabetics with a mode of action outside
the insulin pathway, including 3 guanidine derivatives
(acting on glucose uptake and metabolism), 5 PPARγ
ligands (stimulating adipocyte differentiation) and 6
sulphonylureas (which act by increasing insulin
secretion) were tested. None was found to be active in
the assay. Further confirmation of the specificity of
the screen is derived from testing a library of 800
compounds from Tocris-Cookson, containing mainly
neurological actives, at 20 µM in triplicates. Only 4
compounds rescued dauer formation, a rate not higher
than for random libraries (see results).



Table 2 



Name of compound | supply | MW | drug class/ disease area/ action(s) | solvent | Concentrations tested in µM; (lethal), rescue, dauer enhancer
-----------------+--------+----+-------------------------------------+---------+--------------------------------------------------------------
Synthalin | ICN | 354.5 | guanidine derivative, also NMDA antagonist | DMSO | (333; 166.7; 83.3; 33.3); 20; 16.6; 8.3; 3.3
Metformin HCl (1,1-dimethylbiguanide) | Sigma | 165.6 | guanidine derivative, biguanides, MOA?: decrease hepatic glucose production | DMSO | 333; 166.7; 83.3; 33.3; 20
Phenformin HCl (phenethylbiguanide) | Sigma | 241.7 | guanidine derivative, biguanides, MOA?: decrease hepatic glucose production | DMSO | 333; 166.7; 83.3; 33.3; 20
HNMPA(AM)3 | Calbiochem | 454.4 | insulin receptor tyrosine kinase inhibitor | DMSO | 20
Rapamycin | ICN | 914.2 | insulin signalling enhancer, inhibitor of the mammalian target of rapamycin (mTOR) which is a downstream target of Akt and implicated in Akt's negative regulation of insulin signalling i.e. serine/threonine phosphorylation of IRS-1 | DMSO | 33.3; 16.6; 8.3;
Quercetin | Sigma | 338.3 | insulin signalling inhibitor, inhibitor of phosphatidylinositol 3-kinase and of several other ATP-requiring enzymes e.g. PI4K, PKC, EGFR, calcium, SERCA activator by interacting with nucleotide binding site to mask PLB inhibition | DMSO | 20
okadaic acid | Calbiochem | 805 | insulin signalling inhibitor, inhibits PP2A and PP1 | DMSO | 10; 5; 2.5; 0.6
PD 98059 | Calbiochem | 267.3 | insulin signalling inhibitor, MEK1 inhibitor | DMSO | 20
Wortmannin | Sigma | 428.4 | insulin signalling inhibitor, phosphatidylinositol 3-kinase inhibitor (potent and specific), inhibitor of neutrophil activation and of FMLP-mediated phospholipase D activation | DMSO | 20
LY 294002 | Sigma | 307.3 | insulin signalling inhibitor, phosphatidylinositol 3-kinase inhibitor (specific) | DMSO | 100, 20
phorbol 12-myristate 13-acetate (PMA) | Biomol | 616.8 | insulin signalling inhibitor, PKC activator (elicits serine/threonine phosphorylation of IRS-1) | DMSO | 20
Phosphatidylinositol-3,4,5-trisphosphate (stearyl, arachidonoyl, tetraammonium salt) | Alexis | 1123.1 | insulin signalling, identical to endogenous PI(3,4,5)P3 (not an analog containing only saturated fatty acid residues, therefore greater biological activity), activates Ca2+-insensitive PKC, activates Akt (a serine/threonine kinase) by directly interacting with the Akt pleckstrin homology (PH) domain | DMSO | 2.8; 1.4; 0.7
Phosphatidylinositol-3,4-bisphosphate [L-alpha-] (dipalmitoyl, pentaammonium salt) | Calbiochem | 1056.2 | insulin signalling, mimics endogenous PI(3,4)P2, activates Ca2+-insensitive PKC, activates Akt (a serine/threonine kinase) by directly interacting with the Akt pleckstrin homology (PH) domain | DMSO | 3.17; 1.9; 1.58; 0.79
Phosphatidylinositol-3,4,5-trisphosphate [L-alpha-] (dipalmitoyl, heptaammonium salt) | Calbiochem | 1170.2 | insulin signalling, mimics endogenous PI(3,4,5)P3, activates Ca2+-insensitive PKC, activates Akt (a serine/threonine kinase) by directly interacting with the Akt pleckstrin homology (PH) domain | DMSO | 2.96; 1.74; 1.48
Thalidomide | ICN | 258.2 | insulin signalling, TNF inhibitor | DMSO | 333; 166.7; 83.3; 33.3; 20
Perhexiline | Sigma | 393.6 | insulin, carbohydrate metabolism, inhibitor of myocardial carnitine palmitoyltransferase ("antidiabetics"), sodium, calcium, dual Na+/Ca2+ (T-type) channel blocker, anti-angina (coronary vasodilator), diuretic | DMSO | (333; 166.7; 83.3; 33.3); 20; 16.6; 8.3; 3.3
L-arginine | Sigma | 174.2 | nitric oxide, insulin secretagogue (NO dependent) | water | 333; 166.7; 83.3; 33.3; 20
D-arginine | Sigma | 174.2 | nitric oxide, negative control of L-arginine (insulin secretagogue) | water | 20
LY 171883 | Sigma | 318.4 | PPARgamma activator (weak), selective LTD4 antagonist | DMSO | 20
linoleic acid (9,12-octadecadienoic acid) | Sigma | 280.4 | PPARgamma ligand | DMSO | (333; 166.7; 83.3; 33.3); 20; 16.6; 8.3; 3.3
Linolenic acid (9,12,15-octadecatrienoic acid) | Sigma | 278.4 | PPARgamma ligand | DMSO | (333; 166.7; 83.3; 33.3); 20; 16.6; 8.3; 3.3
Eicosatetraynoic acid [5,8,11,14-] (ETYA) | ICN | 296.5 | PPARgamma ligand, insulin sensitizers, eicosanoid | DMSO | 333; 166.7; 83.3; 33.3; 20
Rosiglitazone (BRL49653) | | 359 | PPARgamma-specific agonist (insulin-sensitizing properties, used in type II diabetes) | water | 309; 500; 263; 135; 55; 27.6; 13.85
Chelerythrine chloride | Sigma | 383.8 | protein kinase C inhibitor (potent, selective, IC50 0.7 µM) | DMSO | 10
Cantharidic acid | Sigma | 214.2 | protein phosphatase 2A inhibitor (IC50 53 nM) | DMSO | 20
Phenylarsine oxide | Calbiochem | 168 | PTP inhibitor, also inhibits rio-Kinase activity | DMSO |
Bromotetramisole oxalate [L-p-] | Biomol | 373.2 | PTP inhibitor, also well known inhibitor of alkaline phosphatase, mimics the action of orthovanadate in the potentiation of fluorouracil antiproliferative activity | water | 20
Bromotetramisole oxalate [D-p-] | Biomol | 373.2 | PTP inhibitor, also well known inhibitor of alkaline phosphatase, mimics the action of orthovanadate in the potentiation of fluorouracil antiproliferative activity: inactive isomer, negative control | water | 20
Dephostatin | Calbiochem | 168.2 | PTP inhibitor, IC50 7.7 µM, also nitric oxide donor (stable NO donor for S-nitrosation of proteins) | DMSO | 333; 166.7; 83.3; 20
vanadium(II) chloride | Aldrich-Sigma | 121.85 | PTP inhibitor, vanadium compound | DMSO | 20
vanadium(III) chloride | Aldrich-Sigma | 157.3 | PTP inhibitor, vanadium compound | DMSO | 1000; 500; 250; 100; 20
vanadium(III) oxide | Aldrich-Sigma | 149.88 | PTP inhibitor, vanadium compound | DMSO | 20
vanadium(IV) oxide | Aldrich-Sigma | 165.88 | PTP inhibitor, vanadium compound | DMSO | 20
vanadium(V) oxide | Aldrich-Sigma | 181.88 | PTP inhibitor, vanadium compound | DMSO | 10
vanadyl sulfate | Aldrich-Sigma | 163 | PTP inhibitor, vanadium compound | DMSO | 1000; 500; 250; 100; 20
vanadyl trifluoride | Fluka-Sigma | 123.94 | PTP inhibitor, vanadium compound | DMSO | 20
mpV(Pic) (monoperoxo(picolinato)oxovanadate(V)) | Calbiochem | 257.1 | PTP inhibitor, vanadium compound | DMSO | 1000; 500; 250; 100; 20
sodium metavanadate | Sigma | 121.9 | PTP inhibitor, vanadium compound, also inhibits ATPase and alkaline phosphatase | water | 1000; 500; 250; 100; 20
sodium orthovanadate | Sigma | 183.9 | PTP inhibitor, vanadium compound, also inhibits ATPase and alkaline phosphatase | water | 1000; 500; 250; 100; 20
bpV(Phen) (potassium bisperoxo(1,10-phenanthroline)oxovanadate(V)) | Calbiochem | 404.3 | PTP inhibitor, vanadium compound, potent | DMSO | 1000; 500; 250; 100; 20
bpV(bipy) (potassium bisperoxo(bipyridine)oxovanadate(V)) | Alexis | 326.2 | PTP inhibitor, vanadium compound, potent | DMSO | 1000; 500; 250; 100; 20
bpV(HOpic) (dipotassium bisperoxo(5-hydroxypyridine-2-carboxyl)oxovanadate(V)) | Alexis | 347.2 | PTP inhibitor, vanadium compound, potent | DMSO | 1000; 500; 250; 100; 20
bpV(pic) (dipotassium bisperoxo(picolinato)oxovanadate(V)) | Alexis | 367.3 | PTP inhibitor, vanadium compound, potent | DMSO | 1000; 500; 250; 100; 20
acetohexamide | ICN | 324.4 | sulfonylureas, first generation, MOA: insulin secretagogue by blocking K+(ATP) channels | DMSO | 333; 166.7; 83.3; 33.3; 20
chlorpropamide | Sigma | 276.7 | sulfonylureas, first generation, MOA: insulin secretagogue by blocking K+(ATP) channels | DMSO | 333; 166.7; 83.3; 33.3; 20
tolazamide | Sigma | 311.4 | sulfonylureas, first generation, MOA: insulin secretagogue by blocking K+(ATP) channels | DMSO | 333; 166.7; 83.3; 33.3; 20
tolbutamide | Sigma | 270.3 | sulfonylureas, first generation, MOA: insulin secretagogue by blocking K+(ATP) channels | DMSO | 333; 166.7; 83.3; 33.3; 20
glipizide | RBI | 445.53 | sulfonylureas, second generation, MOA: insulin secretagogue by blocking K+(ATP) channels | DMSO | 333; 166.7; 83.3; 33.3; 20
glyburide (glybenclamide) | Tocris | 494.1 | sulfonylureas, second generation, MOA: insulin secretagogue by blocking K+(ATP) channels | DMSO | 333; 166.7; 83.3; 33.3; 20
diazoxide | Tocris | 230.7 | potassium, K+ channel opener, activates ATP-sensitive K+ channels, antihypertensive, also stimulates K+ channels in pancreatic islet cells (prodiabetic side effects), diabetes | DMSO | 333; 166.7; 83.3; 33.3; 20




Data acquisition 

All screening was done at 20 µM compound concentration
in quadruplicates, except 2000 compounds of Diverset
in triplicates. Confirmation was done at 4
concentrations. Questionable dose responses were
repeated, if necessary at lower concentrations and/or
2 fold dilution steps. All worms that bypassed dauer
stage, L4s and adults, were counted under a Leica MZ12
dissection scope and together referred to as number of
adults per well. First, the 8 negative controls
(column 1) of all plates were counted, typically
between 800 and 1280 (25 to 40 plates times 4 per
screening session). Data were transferred to Excel
files and the average and variance of the 8 controls of
each plate calculated and plotted.

Outliers of unusually high average or variance were
removed for calculation, since they were found to have
an inappropriately large effect on the calculations
below (3 plates in the example of Figure 5a). Counting
errors were found to have a rather weak effect. The
number of wells was plotted against the number of
adults per well and a negative binomial distribution
fitted by maximum likelihood. In some cases it was
necessary to split a session into two or three different
subsessions, mainly due to differences in incubator
location or worm handling.
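
A maximum-likelihood fit of a negative binomial distribution to
control counts, and the reading of a cut-off from the fitted
distribution, might be sketched in Python as below. This is an
illustration under stated assumptions and not the calculation
actually used: the parameterisation by mean and dispersion r, the
golden-section search, the function names and the control counts
are all assumptions introduced here. The sketch requires a sample
mean greater than zero.

```python
import math


def nb_logpmf(k, r, mean):
    """Log-probability of count k for dispersion r and the given mean (> 0)."""
    p = r / (r + mean)
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))


def fit_negative_binomial(counts):
    """Maximum-likelihood fit; returns (r, mean).

    In this parameterisation the ML estimate of the mean is the sample
    mean, so only r is searched (golden-section search on log r).
    """
    mean = sum(counts) / len(counts)

    def loglik(log_r):
        r = math.exp(log_r)
        return sum(nb_logpmf(k, r, mean) for k in counts)

    lo, hi = math.log(1e-2), math.log(1e4)
    g = (math.sqrt(5) - 1) / 2
    for _ in range(80):
        a = hi - g * (hi - lo)
        b = lo + g * (hi - lo)
        if loglik(a) < loglik(b):
            lo = a          # maximum lies to the right of a
        else:
            hi = b          # maximum lies to the left of b
    return math.exp((lo + hi) / 2), mean


def tail_probability(k, r, mean):
    """P(X >= k) under the fitted distribution."""
    return 1.0 - sum(math.exp(nb_logpmf(i, r, mean)) for i in range(k))


def cutoff(r, mean, alpha, k_max=200):
    """Count whose upper-tail probability is closest to alpha."""
    return min(range(1, k_max),
               key=lambda k: abs(tail_probability(k, r, mean) - alpha))


# Made-up adult counts from negative control wells of one session.
control_counts = [3, 5, 8, 2, 6, 9, 4, 7, 12, 5, 6, 3,
                  10, 4, 8, 15, 2, 7, 5, 6, 9, 4, 11, 6]
r_hat, mean_hat = fit_negative_binomial(control_counts)
cut_1pct = cutoff(r_hat, mean_hat, 0.01)
cut_01pct = cutoff(r_hat, mean_hat, 0.001)
print(r_hat, mean_hat, cut_1pct, cut_01pct)
```

If the control counts are not overdispersed (variance not larger
than the mean), r runs to the upper search bound and the fit
approaches a Poisson distribution.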

Then the number of adults per well where the
cumulative negative binomial distribution was closest
to 99% was determined and referred to as the 1% cut-off.
In the example shown in Figure 6, 20 adults per well
were at 1.10%, indicating that the probability of having
20 or more adults per well is 1.10%. This corresponds
to a 4% chance of a single false positive in
quadruplicates, but only to a 0.07% chance of a
double false positive. Therefore a compound is
positive if at least 2 replicates have values at the
cut-off or higher. In addition the 0.1% cut-off was
determined similarly (24 adults in the example shown
in Figure 6), and if at least 2 replicates reached
that more stringent value the compound was referred
to as a strong positive.
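
The false-positive figures quoted above follow from the binomial
distribution if the four replicate wells are treated as
independent, each with a probability of 1.10% of reaching the
cut-off by chance. The independence assumption and the function
names in this Python sketch are illustrative.

```python
from math import comb


def prob_exactly(k, n, p):
    """Probability that exactly k of n independent wells reach the cut-off."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def prob_at_least(k, n, p):
    """Probability that k or more of n independent wells reach the cut-off."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


p_well = 0.011                          # 1.10% per well, Figure 6 example
single = prob_exactly(1, 4, p_well)     # one false positive among 4 replicates
double = prob_at_least(2, 4, p_well)    # two or more false positives
print(f"single: {single:.2%}, double or more: {double:.3%}")
```

Requiring two positive replicates therefore lowers the chance of
a false call by roughly a factor of 60 relative to accepting a
single positive replicate.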



The plates were then screened through quickly to find
wells with a high number of adults, which were counted,
and if found to reach the cut-off value the position
on the lid was circled and the exact value written in
the circle. For higher numbers above the 0.1% cut-off
an estimate rather than an exact count proved
sufficient. Finally the transparent lids of the 4
replicate plates were stacked on top of each other and
by looking through them it was determined whether 2 or
more lids were circled in any position. For those
positions all the positive values were written into an
Excel file.



For confirmation by dose response fresh compound in
100% DMSO was used, and from an initial dilution to 2%
DMSO three further dilutions in 3.16 fold steps with a
2% DMSO solution in S-buffer were prepared. In that
way 4 concentrations, 20 µM, 6.3 µM, 2 µM and 0.63 µM,
were tested, all in a 0.2% DMSO background. Both columns
1 and 12 contained 0.2% DMSO as control. Each plate
contained 20 different compounds, with 4
replica-plates of them.
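
The dilution series can be reproduced with a few lines of Python;
a 3.16 fold step is close to the square root of 10, so two steps
give a tenfold dilution. The function name dilution_series is an
illustrative assumption.

```python
def dilution_series(top, fold, n):
    """Return n concentrations starting at top, each fold-times lower than the last."""
    return [top / fold**i for i in range(n)]


half_log = dilution_series(20.0, 3.16, 4)   # final well concentrations in µM
print([f"{c:.2g}" for c in half_log])
```

The same helper covers other step sizes, for example
dilution_series(20.0, 2.0, 3) for 2 fold steps.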



Table 3 





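Column | 1     | 2      | 3      | 4      | 5      | 6      | 7      | 8      | 9      | 10     | 11     | 12
-------+-------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+------
       |       | comp1  | comp2  | comp3  | comp4  | comp5  | comp6  | comp7  | comp8  | comp9  | comp10 |
A      | cntrl | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | cntrl
B      | cntrl | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | cntrl
C      | cntrl | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | cntrl
D      | cntrl | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | cntrl
E      | cntrl | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | 20 µM  | cntrl
F      | cntrl | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | 6 µM   | cntrl
G      | cntrl | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | 2 µM   | cntrl
H      | cntrl | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | 0.6 µM | cntrl
       |       | comp11 | comp12 | comp13 | comp14 | comp15 | comp16 | comp17 | comp18 | comp19 | comp20 |

Rows A to D hold comp1 to comp10; rows E to H hold comp11 to comp20.
"Cntrl": abbreviation for control.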



For some compounds an additional dose response with 7
concentrations was made, mostly with 2-fold dilutions
to obtain 20 µM, 10 µM, 5 µM, 2.5 µM, 1.25 µM, 0.63 µM
and 0.31 µM. In that case row H also contained
controls. Each plate contained 10 different compounds,
and 4 replica plates were prepared of each. An example
of the 26 negative controls of 16 plates shows the
variability of the mean, while the standard deviation
remained fairly constant (Figure 5b). Furthermore, the
negative controls expressed as a percentage of the plate
mean were approximately normally distributed (Figure 7).

Therefore all data were normalized according to the
calculation below, which centers the value of no effect
at 0 and calibrates the y-axis to standard deviations.
The concentrations are on the x-axis in logarithmic
scale. All 4 replicates are plotted; in addition, a
smoothed line through the averages is plotted.

value in SD = ((number of adults of the well / average of the controls of the plate) - 1) / SD of the controls of the set

A compound was considered confirmed and designated
a hit when either the average or two of the 4 values
reached 2.5 SD (corresponding to 99.3% confidence) at
any concentration and a reasonable dose-response was
apparent.
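The normalization and the numerical part of the hit criterion can be sketched as follows. This is a minimal illustration, not the applicant's analysis code: the function names, the example well counts and the control statistics are hypothetical, and `control_sd` is assumed to be the standard deviation of the controls after they have been expressed as a fraction of their plate mean. The requirement of a "reasonable dose-response" is a judgment across concentrations and is not modelled here.

```python
from statistics import mean

def normalize_to_sd(adult_counts, plate_control_mean, control_sd):
    """Express well counts in SD units: (count / plate control mean - 1) / SD of controls."""
    return [(c / plate_control_mean - 1.0) / control_sd for c in adult_counts]

def is_hit_at_concentration(sd_values, threshold=2.5):
    """True if the average, or at least two of the replicate values, reach the threshold."""
    n_above = sum(v >= threshold for v in sd_values)
    return mean(sd_values) >= threshold or n_above >= 2

# Hypothetical example: adult counts in 4 replicate wells at one concentration.
replicates = [30, 12, 28, 11]
sd_values = normalize_to_sd(replicates, plate_control_mean=10.0, control_sd=0.5)
print(sd_values)
print(is_hit_at_concentration(sd_values))
```

In this made-up example the average stays below 2.5 SD, but two of the four replicates exceed it, so the concentration counts towards a hit under the stated rule.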

Results 

From 23,040 compounds a total of 300 positives were
obtained during the screening, of which 173 could be
reconfirmed.
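The reconfirmation percentages and hit rates follow directly from these counts. The sketch below recomputes them from the per-library figures reported in Table 4; the dictionary layout and the helper name `rates` are illustrative assumptions.

```python
# (library size, screen positives, confirmed hits) as reported in Table 4
libraries = {
    "Library 1": (2000, 33, 3),
    "Library 2": (5040, 92, 62),
    "Library 3": (16000, 175, 108),
}

def rates(size, positives, confirmed):
    """Return (% of positives reconfirmed, hit rate as % of library size)."""
    return 100.0 * confirmed / positives, 100.0 * confirmed / size

for name, counts in libraries.items():
    reconfirmed, hit_rate = rates(*counts)
    print(f"{name}: {reconfirmed:.0f}% reconfirmed, hit rate {hit_rate:.2f}%")

# Column-wise totals: (size, positives, confirmed)
totals = tuple(sum(col) for col in zip(*libraries.values()))
total_reconfirmed, total_hit_rate = rates(*totals)
print(f"TOTAL {totals}: {total_reconfirmed:.0f}% reconfirmed, hit rate {total_hit_rate:.2f}%")
```

The totals agree with the 23,040 compounds, 300 positives and 173 reconfirmed hits stated in the text.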
Table 4

library name    size     positives   confirmed hits   % reconfirmed   hit rate
Library 1       2000     33          3                9%              0.15%
Library 2       5040     92          62               67%             1.23%
Library 3       16000    175         108              62%             0.68%
TOTAL           23040    300         173              58%             0.75%



To estimate the potency of the screen, that is, to
estimate what fraction of the compounds that could have
been identified with the assay were actually
identified during the screen, an analysis of 47
compounds defining 11 chemical clusters was
performed: 36 of these compounds were confirmed.
Another 40 compounds, which were not found to be
active in the original screen but are members of those
clusters, were submitted to dose-response
confirmation, and 4 more hits were identified. In
total 40 compounds could be confirmed: 36 of the
screen positives and 4 from the extra set. Hence 90%
of the final hits of these clusters were detected in
the original screen and 10% were missed.
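The arithmetic behind the 90% figure can be written out as a short calculation. The function name `screen_recovery` is a hypothetical helper; the counts are those given in the paragraph above.

```python
def screen_recovery(confirmed_from_screen, extra_hits):
    """Return (fraction detected, fraction missed) among all final hits."""
    final_hits = confirmed_from_screen + extra_hits
    detected = confirmed_from_screen / final_hits
    return detected, 1.0 - detected

# 36 confirmed screen positives plus 4 hits found only on retesting cluster members.
detected, missed = screen_recovery(confirmed_from_screen=36, extra_hits=4)
print(f"detected: {detected:.0%}, missed: {missed:.0%}")
```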
Table 5

Cluster   positives     confirmed hits   similar negatives   extra hits   final hits
1         [illegible]   4                1                   0            4
3         [illegible]   6                7                   1            7
4         7             6                1                   0            6
5         4             4                1                   0            4
6         3             3                5                   1            4
7         5             3                1                   0            3
8         [illegible]   1                7                   1            2
9         5             4                13                  0            4
12        5             2                1                   0            2
13        2             2                2                   0            2
15        2             1                1                   1            2
Total     47            36               40                  4            40



Conclusions 

1. A mutation in the C. elegans insulin receptor,
   daf-2(m41), was used successfully in a
   pharmacological assay for compounds acting in the
   downstream pathway.

2. The assay is sensitive enough to screen at 20 µM
   compound concentrations, at which there were
   nearly no problems due to lethality (27 of
   23,040).

3. A hit rate of 0.75% from combinatorial chemistry
   libraries has been obtained, strongly dependent
   on the library.

4. The screen is specific for the insulin receptor
   pathway and is unlikely to yield many hits
   upstream, e.g. compounds stimulating insulin release.

5. The active compounds are candidates to cure
   insulin resistance and are therefore of potential
   therapeutic use in type II diabetes and obesity.

6. Since the compounds bypass the need for insulin,
   they are also of potential use in type I
   diabetes.

7. The major mode of compound entry in C. elegans is
   the gut, which pre-selects for orally active
   compounds.

8. The activity is retrieved from a whole-organism
   readout, leaving intact tissue-specific insulin
   signalling and feedback loops.



WO 01/93669 



PCT/IB01/01199 




Table 6: Retest of 94 compounds at 20 µM on 3 different
daf-2 alleles: m41 at 21°C, e1368 and e1370 at 25°C.
Values: 3: all replicates above 99% threshold; 2:
median above 99.9% threshold; 1: median above 99%
threshold; 0: median below 99% threshold.



ID                 Plate  Row  Col  m41  e1368  e1370
217485   547.18    1      A    2    1    1      0
211706   472.55    1      A    3    3    3      0
181141   459.51    1      A    4    3    1      0
259910   384.53    1      A    5    0    0      0
194326   393.49    1      A    6    2    0      0
217336   420.04    1      A    7    3    3      0
267546   372.51    1      A    8    0    0      0
228433   405.56    1      A    9    0    0      0
264792   436.94    1      A    10   3    0      0
255126   431.50    1      A    11   3    0      0
100718   399.88    1      B    2    3    0      0
182576   486.39    1      B    3    0    0      0
232839   475.30    1      B    4    3    1      0
217339   394.00    1      B    5    3    1      0
217341   394.00    1      B    6    3    2      0
118776   437.52    1      B    7    2    0      0
118783   452.35    1      B    8    3    2      0
118789   442.35    1      B    9    2    1      0
248144   440.89    1      B    10   3    0      0
234291   462.76    1      B    11   0    0      0
212465   367.39    1      C    2    0    0      0
144331   363.98    1      C    3    0    0      0
138263   372.51    1      C    4    2    1      0
264982   352.48    1      C    5    1    1      0
267659   386.93    1      C    6    1    0      0
115771   391.50    1      C    7    3    0      0
105359   326.40    1      C    8    3    0      0
267467   419.37    1      C    9    0    0      0
236867   480.25    1      C    10   0    0      0
225671   365.44    1      C    11   0    0      0
225858   444.33    1      D    2    0    1      0
225615   523.23    1      D    3    0    1      0
101025   431.42    1      D    4    1    0      0
255192   420.38    1      D    5    3    1      0
217850   391.27    1      D    6    3    0      0
214475   329.36    1      D    7    3    1      0
114446   479.71    1      D    8    2    0      0
261736   378.40    1      D    9    2    0      0
210145   373.84    1      D    10   0
Example 2: automatic data acquisition with Nile Red
staining

Material:

Hardware:

- microtiter plates: 96-well black U-shaped plates
  (DYNEX Microfluor® 2)
- Wallac 1420 plate reader (Victor 2):
    Nile Red protocol: excitation = 530 nm
                       emission = 590 nm
    Counting time: 1 second
    CW lamp energy: 30445
    Emission aperture: damp
    Counter position: top
    Measurement height: 3 mm from bottom of the plate

Consumables:

- Nile Red (Sigma, N-3013)
- Ivermectin (ICN, 196009)
Method:

Prepare a 100 mM solution of Nile Red (Nile Blue
A Oxazone) in pure methanol. Centrifuge to
separate the saturated solution from the
undissolved Nile Red.

Dilute in steps of 10 with buffer to 500 ]M. 

Add Nile Red 1:1 to the worms and incubate for 30
min at room temperature.

Add ivermectin to a final concentration of 10 µM and
incubate for 30 min at room temperature.

Measure.



Example 3: automatic data acquisition with a
vit-2::luciferase reporter

Material:

Hardware:

- microtiter plates: 96-well white U-shaped plates
  (DYNEX Microfluor® 2)
- Wallac 1420 plate reader (Victor 2):
    Luciferase protocol
    Emission filter: no filter
    Counting time: 3 seconds
    Emission aperture: normal

Consumables:

- Triton X-100 (BDH, 306324N)
- Dual-Luciferase® Reporter Assay System (Promega,
  E4550)

Method:

- Add Triton X-100 (1% final concentration) to lyse
  the worms.
- Shake for 1 minute and freeze.
- Thaw the plates and add luciferin 1:1.
- Shake for 1 minute and measure.

Example 4: construction of ctl-1::luciferase and
sod-3::luciferase reporters

1) Construction of pGQ1

1.1 PCR

PCR (turbo pfu) on N2 genomic DNA with:
oGQ1: ctl-1::GFP fw (PstI):
5' AAAACTGCAGCCAATGCATTGGAAGAGATATTTTGCGCGTCAAATATGTTTTGTGTCC 3'
oGQ2bis: ctl-1::GFP rv (BamHI):
5' CGCGGATCCGGCCGATTCTCCAGCGACCG 3'

1.2 Cloning

- Digest of the PCR fragment with PstI and BamHI
- Ligation into pDW2020 and transformation into DH10B

2) Construction of pGQ2

2.1 PCR

PCR (turbo pfu) on N2 genomic DNA with:
oGQ3: ctl-1::luciferase fw (StuI):
5' CCAGGCCTGAGATATTTTGCGCGTCAAATATGTTTTGTGTCC 3'
oGQ4: ctl-1::luciferase rv (SacI):
5' CGGAGCTCCGATTGGATGTGGTGAGCAGG 3'

2.2 Cloning

- Digest of the PCR fragment with StuI and SacI
- Ligation into pCluc6 and transformation into DH10B

3) Construction of pGQ3

3.1 PCR

PCR (turbo pfu) on N2 genomic DNA with:
oGQ7: sod-3 fw:
5' GCAGAATTTGCAAAACGAGCAGGAAAGTC 3'
oGQ6: sod-3::luciferase rv (AscI):
5' TTGGCGCGCCAAGCCTTAATAGTGTCCATCAGC 3'

3.2 Cloning

- Digest of the PCR fragment with PstI and AscI
- Ligation into pDW2020 and transformation into DH10B

4) Construction of pGQ4

4.1 PCR

PCR (turbo pfu) on N2 genomic DNA with:
oGQ7: sod-3 fw:
5' GCAGAATTTGCAAAACGAGCAGGAAAGTC 3'
oGQ8: sod-3::luciferase rv (SacI):
5' CTGAGCTCGGCTTAATAGTGTCCATCAGC 3'

4.2 Cloning

- Digest of the PCR fragment with PstI and SacI
- Ligation into pCluc6 and transformation into DH10B

Example 5: Construction of pCluc6

Vector:

- Restriction digest of pCluc2 with HindIII
- Purification, protocol: Jetsorb

Insert:

- PCR the vit-2 promoter (248 bp in front of exon 1,
  just before the ATG) out of N2 genomic DNA with
  primers (designed from ACeDB C42D8.2) that contain
  HindIII RE sites:
  vit-2F: 5' CCCCCAAGCTTCCATGTGCTAGCTGAGTTTCATCATGTCC 3'
  vit-2R: 5' CCCCCCAAGCTTGGCTGAACCGTGATTGG 3'
- Restriction digest of the PCR product with HindIII
- Purification, protocol: Jetsorb

pCluc6:

- T4 DNA ligation of vector and insert
- Transformation into DH10B
- Mini DNA preparation, protocol: Wizard SV Miniprep
- Determine direction of insert by RE cleavage with
  XbaI/NheI
- Maxi DNA preparation, protocol: Jetstar
- Check maxiprep by sequencing with o-PUCI primer.
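As a sanity check on the primer listings in Examples 4 and 5, one can verify that each primer annotated with a restriction enzyme actually carries that enzyme's recognition site. The sketch below is illustrative only: the dictionary and function names are hypothetical, the recognition sequences are the standard ones for the named enzymes, and the primer sequences are copied from the text as printed. oGQ7 is omitted because the text names no added site for it.

```python
# Standard recognition sequences of the enzymes named in Examples 4 and 5.
SITES = {
    "PstI": "CTGCAG",
    "BamHI": "GGATCC",
    "StuI": "AGGCCT",
    "SacI": "GAGCTC",
    "AscI": "GGCGCGCC",
    "HindIII": "AAGCTT",
}

# Primer name -> (enzyme named in the text, primer sequence as printed, 5' to 3')
PRIMERS = {
    "oGQ1": ("PstI", "AAAACTGCAGCCAATGCATTGGAAGAGATATTTTGCGCGTCAAATATGTTTTGTGTCC"),
    "oGQ2bis": ("BamHI", "CGCGGATCCGGCCGATTCTCCAGCGACCG"),
    "oGQ3": ("StuI", "CCAGGCCTGAGATATTTTGCGCGTCAAATATGTTTTGTGTCC"),
    "oGQ4": ("SacI", "CGGAGCTCCGATTGGATGTGGTGAGCAGG"),
    "oGQ6": ("AscI", "TTGGCGCGCCAAGCCTTAATAGTGTCCATCAGC"),
    "oGQ8": ("SacI", "CTGAGCTCGGCTTAATAGTGTCCATCAGC"),
    "vit-2F": ("HindIII", "CCCCCAAGCTTCCATGTGCTAGCTGAGTTTCATCATGTCC"),
    "vit-2R": ("HindIII", "CCCCCCAAGCTTGGCTGAACCGTGATTGG"),
}

def site_position(primer_seq, enzyme):
    """Return the 0-based index of the enzyme's recognition site in the primer, or -1."""
    return primer_seq.upper().find(SITES[enzyme])

results = {name: site_position(seq, enzyme) for name, (enzyme, seq) in PRIMERS.items()}
for name, pos in results.items():
    status = "found at index" if pos >= 0 else "NOT FOUND"
    print(name, PRIMERS[name][0], status, pos)
```

In every primer the site sits near the 5' end, preceded by a few extra bases, which is the usual layout for primers that add a cloning site to a PCR product.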

Standard methods and worm strains 

Standard methods for culturing nematodes are described
in Methods in Cell Biology Vol. 48, 1995, ed. by
Epstein and Shakes, Academic Press. Standard methods
are known for creating mutant worms with mutations in
selected C. elegans genes, for example see J. Sulston
and J. Hodgkin in "The Nematode Caenorhabditis
elegans", Ed. by William B. Wood and the Community of
C. elegans Researchers, CSHL, 1988, 594-595; Zwaal et
al, "Target-Selected Gene Inactivation in
Caenorhabditis elegans by using a Frozen Transposon
Insertion Mutant Bank", 1993, Proc. Natl. Acad. Sci.
USA 90, pp 7431-7435; Fire et al, "Potent and Specific
Genetic Interference by Double-Stranded RNA in C.
elegans", 1998, Nature 391, 806-811. A population of
worms can be subjected to random mutagenesis by using
EMS, TMP-UV or radiation (Methods in Cell Biology, Vol
48, ibid). Several selection rounds of PCR could then
be performed to select a mutant worm with a deletion
in a desired gene.

A range of specific C. elegans mutants is available
from the C. elegans mutant collection at the C.
elegans Genetics Center, University of Minnesota, St
Paul, Minnesota.

E. coli strain OP50 can be obtained from the C.
elegans Genetics Center, University of Minnesota, St
Paul, Minnesota, USA.
CLAIMS:

1. A method for the identification of a
compound which is capable of modulating insulin
signalling pathways, which method comprises:
providing C. elegans dauer larvae;
contacting said larvae with a test compound; and
screening for release from the dauer larval state,
wherein the C. elegans dauer larvae possess a
sensitized genetic background, as compared to the
reference daf-2 mutant e1370.

2. Method according to claim 1, in which the
dauer larvae belong to a nematode strain which has an
Insulin Sensitivity Value ("ISV") that is greater than
the ISV for the reference nematode strain CB1370, in
particular more than 1% greater, preferably more than
5% greater, more preferably more than 10% greater,
even more preferably more than 20% greater.

3. Method according to claim 1 and/or 2, in
which the dauer larvae belong to a nematode strain
which has an ISV that is >30%, preferably >40%, even
more preferably >50%.

4. A method as claimed in claim 1 wherein the
C. elegans dauer larvae are daf-2(m41) mutants.

5. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a daf-2 class I
allele other than daf-2(m41).

6. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise at least one
loss-of-function or reduction-of-function mutation in a
gene(s) downstream of the insulin receptor in the
insulin signalling pathway.

7. A method as claimed in claim 6 wherein the
C. elegans dauer larvae comprise a loss-of-function or
reduction-of-function mutation in the age-1 gene.

8. A method as claimed in claim 6 wherein the
C. elegans dauer larvae comprise loss-of-function or
reduction-of-function mutations in the akt-1 gene and
the akt-2 gene.

9. A method as claimed in claim 6 wherein the
C. elegans dauer larvae comprise a loss-of-function or
reduction-of-function mutation in the pdk-1 gene.

10. A method as claimed in claim 9 wherein the
C. elegans dauer larvae are pdk-1(sa680) mutants.

11. A method as claimed in claim 1 wherein the
C. elegans dauer larvae are larvae wherein the dauer
phenotype is induced by treatment with an
inhibitor of at least one component of the insulin
receptor signalling pathway.

12. A method as claimed in claim 11 wherein the
inhibitor compound is an inhibitor of the C. elegans
PI3-kinase.
13. A method as claimed in claim 12 wherein the
inhibitor compound is wortmannin or LY294002.

14. A method as claimed in claim 1 wherein
expression of at least one gene downstream of the
insulin receptor in the insulin receptor signalling
pathway in said C. elegans dauer larvae is inhibited
by RNAi inhibition.

15. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a gain-of-function
mutation in the daf-16 gene.

16. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a gain-of-function
mutation in the daf-18 gene.

17. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a gain-of-function
mutation in the C. elegans homologue of the SHIP2
gene.

18. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a gain-of-function
mutation in the C. elegans homologue of the PTP-1B
gene.

19. A method as claimed in claim 1 wherein the
C. elegans dauer larvae exhibit a defect in perception
of environmental signals.

20. A method as claimed in claim 19 wherein the
said C. elegans dauer larvae comprise a mutation in
the tph-1 gene.

21. A method as claimed in claim 20 wherein the
said C. elegans dauer larvae are tph-1(mg280) mutants.

22. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a daf-c mutation in a
daf gene selected from the group consisting of daf-1,
daf-4, daf-7, daf-8, daf-11, daf-14, daf-21, daf-19
and daf-28.

23. A method as claimed in claim 1 wherein the
C. elegans dauer larvae comprise a mutation in a gene
encoding a neuronal G-protein.

24. A method as claimed in claim 1 wherein the
C. elegans dauer larvae are unc-64(e264); unc-31(e928)
mutants.

25. A method as claimed in any one of claims 1
to 24 wherein the step of screening for release from
the dauer larval state comprises screening for adult
C. elegans.

26. A method as claimed in any one of claims 1
to 24 wherein the step of screening for release from
the dauer larval state comprises screening for changes
in fat storage.

27. A method as claimed in any one of claims 1
to 24 wherein said C. elegans dauer larvae further
comprise a reporter transgene comprising a promoter
which is capable of directing strong gene expression
in adult C. elegans and no or weak expression in dauer
larvae or vice versa operably linked to a reporter
gene and the step of screening for release from the
dauer larval state comprises screening for changes in
expression of the said reporter gene.

10 

28. A method for the identification of a
compound which is capable of modulating insulin
signalling pathways, which method comprises:
providing C. elegans dauer larvae;
contacting said larvae with a test compound; and
screening for release from the dauer larval state,
wherein conditions of the assay are selected such that
a basal level of release from the dauer larval state
is observed in the absence of the test compound.

29. A method as claimed in claim 28 wherein the
basal level of release from the dauer larval state is
between 0.1% and 40%.

30. A method as claimed in claim 29 wherein the
basal level of release from the dauer larval state is
between 1% and 30%.

31. A method as claimed in claim 30 wherein the
basal level of release from the dauer larval state is
between 2% and 20%.

32. A method as claimed in any one of claims 28
to 31 wherein the C. elegans dauer larvae are
daf-2(m41) mutants.

33. A method as claimed in any one of claims 28
to 31 wherein the C. elegans dauer larvae are daf-2;
daf-18 double mutants.

34. A method as claimed in any one of claims 28
to 31 wherein the C. elegans dauer larvae are Daf-d
mutants.

35. A method as claimed in any one of claims 28
to 31 wherein the C. elegans dauer larvae comprise a
gain-of-function mutation in the pdk-1 gene.

36. A method as claimed in claim 35 wherein the
C. elegans dauer larvae are pdk-1(mg142) mutants.

37. A method as claimed in any one of claims 28
to 31 wherein the C. elegans dauer larvae comprise a
gain-of-function mutation in the akt-1 gene.

38. A method as claimed in claim 37 wherein the
C. elegans dauer larvae are akt-1(mg144) mutants.

39. A method as claimed in any one of claims 28
to 31 wherein the C. elegans dauer larvae are daf-16;
daf-2 double mutants and further comprise a transgene
capable of expressing a mammalian homolog of the
daf-16 protein.

40. A method as claimed in claim 39 wherein the
mammalian homolog of the daf-16 protein is the human
FKHR protein, the human FKHRL1 protein or the human
AFX protein.

41. A method as claimed in claim 28 wherein said
C. elegans dauer larvae are larvae which have been
treated with pheromone to reduce the fraction of
worms growing to adults to below 40%.

42. A method as claimed in claim 41 wherein said
C. elegans dauer larvae are larvae which have been
treated with pheromone to reduce the fraction of
worms growing to adults to below 30%.

43. A method as claimed in claim 42 wherein said
C. elegans dauer larvae are larvae which have been
treated with pheromone to reduce the fraction of
worms growing to adults to below 20%.

44. A method as claimed in any one of claims 28
to 43 wherein the step of screening for release from
the dauer larval state comprises screening for adult
C. elegans.

45. A method as claimed in any one of claims 28
to 43 wherein said C. elegans larvae further comprise
a reporter transgene comprising a promoter which is
capable of directing strong gene expression in adult
C. elegans and no or weak expression in dauer larvae
or vice versa operably linked to a reporter gene and
the step of screening for rescue of the daf-2 mutation
comprises screening for expression of the said
reporter gene.

46. A method as claimed in any one of claims 28
to 43 wherein the step of screening for release from
the dauer larval state comprises screening for changes
in fat storage.

47. A method for the identification of a
compound which is capable of modulating insulin
signalling pathways, which method comprises:
a) providing a sample of nematode worms (preferably
eggs, L1 or L2 worms, and most preferably L1
worms);
b) keeping said sample under conditions such that, without
the presence of any compound(s) to be tested, at
least 50%, and preferably at least 60%, and more
preferably at least 70%, even more preferably at
least 80%, such as 85-100% of the nematodes
present in said sample would enter the dauer state
(at least during the time used for the assay);
c) exposing the sample to the compound(s) to be
tested;
d) measuring the number of worms that enter the
dauer state, and/or measuring the number of worms
that grow into adults.

48. Method according to claim 47, in which the
conditions used in step b) are such that, in the
presence of a reference compound at a suitable
concentration, the amount of worms that enter the
dauer state is at least 10% less, preferably at least
20% less, more preferably at least 30% less, than the
amount of worms that would enter the dauer state
without the presence of any such reference compound
(at least during the time used for the assay).

49. Method according to claim 46 and/or 47, in
which the conditions used in step b) are such that, in
the presence of a reference compound at a suitable
concentration, the amount of worms that enter the
dauer state is less than 50%, preferably less than
40%, even more preferably less than 30% (at least
during the time used for the assay).

50. Method according to any of claims 47-49, in
which the nematode worms that form the sample belong
to a nematode strain that has an Insulin Sensitivity
Value ("ISV") that is greater than the ISV for the
reference nematode strain CB1370, in particular more
than 1% greater, preferably more than 5% greater, more
preferably more than 10% greater, even more preferably
more than 20% greater.

51. Method according to any of claims 47-50, in
which the nematode worms that form the sample belong
to a nematode strain which has an ISV that is >30%,
preferably >40%, even more preferably >50%.

52. Method according to any of claims 47-50, in
which the nematodes used in the sample are daf-2(m41)
mutants.

53. Use of at least one nematode worm, which has
an increased sensitivity of the insulin signalling
pathway, in an assay for the identification of a
compound which is capable of modulating insulin
signalling pathways.

54. Use according to claim 53, in which the
nematode worm belongs to a strain that has an Insulin
Sensitivity Value ("ISV") that is greater than the ISV
for the reference nematode strain CB1370, in
particular more than 1% greater, preferably more than
5% greater, more preferably more than 10% greater,
even more preferably more than 20% greater.

55. Use according to claim 53 and/or 54, in which
the nematode worm belongs to a strain that has an
Insulin Sensitivity Value ("ISV") that is >30%,
preferably >40%, even more preferably >50%.

56. Use according to any of claims 53-55, in
which the nematode worm used is a daf-2(m41) mutant.

57. Use according to any of claims 53-56, in an
assay that is carried out in a multi-well plate
format.

58. Use according to any of claims 53-57, in an
assay that is carried out in an automated fashion.

59. Use according to any of claims 53-58, in an
assay based on the dauer phenotype as a biological
read out, such as on the entry into, the bypass of
and/or the rescue from the dauer state, and/or on any
other property which results from and/or is associated
with the so-called dauer decision.

60. Use according to claim 59, in an assay based
on entry into the dauer state and/or bypass of the
dauer state as a biological read out.

61. Use according to claim 59, in an assay based
on rescue from the dauer state as a biological read
out.

62. Use according to any of claims 53-61, for the
identification of a small molecule and/or a small
peptide.
Figure 2. The reference allele of daf-2 is e1370

daf-2

Name                    Gene_class daf
Type                    Gene; Reference_allele e1370
Phenotype               e1370ts: constitutive dauer formation
                        at 25C; reversible by shift to 15C. ES3
                        (L3). NA19.
                        See also e1032, e1286, e1365, e1368,
                        e1370, e1391.
                        [C.elegansII] e1370ts: constitutive
                        dauer formation at 25C; reversible by
                        shift to 15C. Increased lifespan at 20C;
                        increased thermotolerance, UV
                        resistance. Non-Srf. Synthetic lethal
                        with daf-12. ES3 (L3). OA>40: e1032,
                        e1286, e1365, sa230 (100% Daf-c at all
                        temperatures), sa223 (sterile), m65
                        (nonconditional), etc. Most alleles
                        (not e1370) hypersensitive to dauer
                        pheromone. [Larsen et al. 1995; Malone
                        and Thomas 1994; CF; JT]
Molecular_information   Sequence EMBL:AF012437.1
                                 EMBL:AF012437.2
                                 Y55D5A_391.b
Map                     III Position -9.68234 Error 0.059406
                        Positive Inside_rearr [illegible]
Figure 3: Na-ortho-vanadate rescues insulin resistance caused by daf-2(m41)
Figure 4: Wortmannin further enhances insulin resistance
Figure 5: Scatter plots of mean and variance of controls: a (left): screening, b (right): DRC
Figure 6: Distribution of controls and a maximum likelihood fit of a negative binomial distribution
Figure 7: Distribution of controls in percent of the average of the plate.

[Figure 7 plot. Title: Variation of negative controls. x-axis: percent of plate average control.]



Figure 8

pDW2020 sequence:

MCS I 

PstI BamHI 

1 ATGACCATGA TTACGCCAAG CTTGCATGCC TGCAGGTCGA CTCTAGAGGA
TACTGGTACT AATGCGGTTC GAACGTACGG ACGTCCAGCT GAGATCTCCT

MCS I synth. intron A 

BamHI 

51 TCCCCGGGAT TGGCCAAAGG ACCCAAAGGT ATGTTTCGAA TGATACTAAC 
AGGGGCCCTA ACCGGTTTCC TGGGTTTCCA TACAAAGCTT ACTATGATTG 

synth. intron A 



101 ATAACATAGA ACATTTTCAG GAGGACCCTT GGCTAGCGTC GACGGTACCA 
TATTGTATCT TGTAAAAGTC CTCCTGGGAA CCGATCGCAG CTGCCATGGT 



AscI GFP with introns 

151 TGGGGCGCGC CATGAGTAAA GGAGAAGAAC TTTTCACTGG AGTTGTCCCA
ACCCCGCGCG GTACTCATTT CCTCTTCTTG AAAAGTGACC TCAACAGGGT 

GFP with introns 

201 ATTCTTGTTG AATTAGATGG TGATGTTAAT GGGCACAAAT TTTCTGTCAG
TAAGAACAAC TTAATCTACC ACTACAATTA CCCGTGTTTA AAAGACAGTC 

GFP with introns 



251 TGGAGAGGGT GAAGGTGATG CAACATACGG AAAACTTACC CTTAAATTTA 
ACCTCTCCCA CTTCCACTAC GTTGTATGCC TTTTGAATGG GAATTTAAAT 

GFP with introns 



301 TTTGCACTAC TGGAAAACTA CCTGTTCCAT GGGTAAGTTT AAACATATAT 
AAACGTGATG ACCTTTTGAT GGACAAGGTA CCCATTCAAA TTTGTATATA 

GFP with introns 

351 ATACTAACTA ACCCTGATTA TTTAAATTTT CAGCCAACAC TTGTCACTAC 
TATGATTGAT TGGGACTAAT AAATTTAAAA GTCGGTTGTG AACAGTGATG 

GFP with introns



401 TTTCTGTTAT GGTGTTCAAT GCTTCTCGAG ATACCCAGAT CATATGAAAC
AAAGACAATA CCACAAGTTA CGAAGAGCTC TATGGGTCTA GTATACTTTG 



GFP with introns 






451 GGCATGACTT TTTCAAGAGT GCCATGCCCG AAGGTTATGT ACAGGAAAGA 
CCGTACTGAA AAAGTTCTCA CGGTACGGGC TTCCAATACA TGTCCTTTCT 

GFP with introns 



501 ACTATATTTT TCAAAGATGA CGGGAACTAC AAGACACGTA AGTTTAAACA 
TGATATAAAA AGTTTCTACT GCCCTTGATG TTCTGTGCAT TCAAATTTGT 

GFP with introns 

551 GTTCGGTACT AACTAACCAT ACATATTTAA ATTTTCAGGT GCTGAAGTCA 
CAAGCCATGA TTGATTGGTA TGTATAAATT TAAAAGTCCA CGACTTCAGT 

GFP with introns 



601 AGTTTGAAGG TGATACCCTT GTTAATAGAA TCGAGTTAAA AGGTATTGAT 
TCAAACTTCC ACTATGGGAA CAATTATCTT AGCTCAATTT TCCATAACTA 

GFP with introns 



651 TTTAAAGAAG ATGGAAACAT TCTTGGACAC AAATTGGAAT ACAACTATAA 
AAATTTCTTC TACCTTTGTA AGAACCTGTG TTTAACCTTA TGTTGATATT 

GFP with introns 

701 CTCACACAAT GTATACATCA TGGCAGACAA ACAAAAGAAT GGAATCAAAG
GAGTGTGTTA CATATGTAGT ACCGTCTGTT TGTTTTCTTA CCTTAGTTTC 

GFP with introns 



751 TTGTAAGTTT AAACTTGGAC TTACTAACTA ACGGATTATA TTTAAATTTT 
AACATTCAAA TTTGAACCTG AATGATTGAT TGCCTAATAT AAATTTAAAA



GFP with introns 



801 CAGAACTTCA AAATTAGACA CAACATTGAA GATGGAAGCG TTCAACTAGC 
GTCTTGAAGT TTTAATCTGT GTTGTAACTT CTACCTTCGC AAGTTGATCG 

GFP with introns 



851 AGACCATTAT CAACAAAATA CTCCAATTGG CGATGGCCCT GTCCTTTTAC 
TCTGGTAATA GTTGTTTTAT GAGGTTAACC GCTACCGGGA CAGGAAAATG 



GFP with introns 



901 CAGACAACCA TTACCTGTCC ACACAATCTG CCCTTTCGAA AGATCCCAAC 
GTCTGTTGGT AATGGACAGG TGTGTTAGAC GGGAAAGCTT TCTAGGGTTG 

GFP with introns 



951 GAAAAGAGAG ACCACATGGT CCTTCTTGAG TTTGTAACAG CTGCTGGGAT 
CTTTTCTCTC TGGTGTACCA GGAAGAACTC AAACATTGTC GACGACCCTA 

GFP with introns                FseI

1001 TACACATGGC ATGGATGAAC TATACAAATA GGGCCGGCCG AGCTCCGCAT
ATGTGTACCG TACCTACTTG ATATGTTTAT CCCGGCCGGC TCGAGGCGTA

unc-54 3' UTR



1051 CGGCCGCTGT CATCAGATCG CCATCTCGCG CCCGTGCCTC TGACTTCTAA 
GCCGGCGACA GTAGTCTAGC GGTAGAGCGC GGGCACGGAG ACTGAAGATT 



unc-54 3' UTR 



1101 GTCCAATTAC TCTTCAACAT CCCTACATGC TCTTTCTCCC TGTGCTCCCA 
CAGGTTAATG AGAAGTTGTA GGGATGTACG AGAAAGAGGG ACACGAGGGT 

unc-54 3' UTR



1151 CCCCCTATTT TTGTTATTAT CAAAAAAACT TCTTCTTAAT TTCTTTGTTT 
GGGGGATAAA AACAATAATA GTTTTTTTGA AGAAGAATTA AAGAAACAAA 

unc-54 3' UTR 



1201 TTTAGCTTCT TTTAAGTCAC CTCTAACAAT GAAATTGTGT AGATTCAAAA 
AAATCGAAGA AAATTCAGTG GAGATTGTTA CTTTAACACA TCTAAGTTTT 

unc-54 3' UTR 



1251 ATAGAATTAA TTCGTAATAA AAAGTCGAAA AAAATTGTGC TCCCTCCCCC 
TATCTTAATT AAGCATTATT TTTCAGCTTT TTTTAACACG AGGGAGGGGG 

unc-54 3' UTR 



1301 CATTAATAAT AATTCTATCC CAAAATCTAC ACAATGTTCT GTGTACACTT 
GTAATTATTA TTAAGATAGG GTTTTAGATG TCTTACAAGA CACATGTGAA 

unc-54 3' UTR



1351 CTTATGTTTT TTTTACTTCT GATAAATTTT TTTTGAAACA TCATAGAAAA
GAATACAAAA AAAATGAAGA CTATTTAAAA AAAACTTTGT AGTATCTTTT

unc-54 3' UTR 



1401 AACCGCACAC AAAATACCTT ATCATATGTT ACGTTTCAGT TTATGACCGC 
TTGGCGTGTG TTTTATGGAA TAGTATACAA TGCAAAGTCA AATACTGGCG 

unc-54 3' UTR 



1451 AATTTTTATT TCTTCGCACG TCTGGGCCTC TCATGACGTC AAATCATGCT 
TTAAAAATAA AGAAGCGTGC AGACCCGGAG AGTACTGCAG TTTAGTACGA

unc-54 3' UTR



1501 CATCGTGAAA AAGTTTTGGA GTATTTTTGG AATTTTTCAA TCAAGTGAAA
GTAGCACTTT TTCAAAACCT CATAAAAACC TTAAAAAGTT AGTTCACTTT

unc-54 3' UTR



1551 GTTTATGAAA TTAATTTTCC TGCTTTTGCT TTTTGGGGGT TTCCCCTATT 
CAAATACTTT AATTAAAAGG ACGAAAACGA AAAACCCCCA AAGGGGATAA 

unc-54 3' UTR 



1601 GTTTGTCAAG AGTTTCGAGG ACGGCGTTTT TCTTGCTAAA ATCACAAGTA 
CAAACAGTTC TCAAAGCTCC TGCCGCAAAA AGAACGATTT TAGTGTTCAT 



unc-54 3' UTR 



1651 TTGATGAGCA CGATGCAAGA AAGATCGGAA GAAGGTTTGG GTTTGAGGCT
AACTACTCGT GCTACGTTCT TTCTAGCCTT CTTCCAAACC CAAACTCCGA

unc-54 3' UTR



1701 CAGTGGAAGG TGAGTAGAAG TTGATAATTT GAAAGTGGAG TAGTGTCTAT
GTCACCTTCC ACTCATCTTC AACTATTAAA CTTTCACCTC ATCACAGATA

unc-54 3' UTR


1751 GGGGTTTTTG CCTTAAATGA CAGAATACAT TCCCAATATA CCAAACATAA 
CCCCAAAAAC GGAATTTACT GTCTTATGTA AGGGTTATAT GGTTTGTATT 

unc-54 3' UTR



1801 CTGTTTCCTA CTAGTCGGCC GTACGGGCCC TTTCGTCTCG CGCGTTTCGG
GACAAAGGAT GATCAGCCGG CATGCCCGGG AAAGCAGAGC GCGCAAAGCC

1851 TGATGACGGT GAAAACCTCT GACACATGCA GCTCCCGGAG ACGGTCACAG 
ACTACTGCCA CTTTTGGAGA CTGTGTACGT CGAGGGCCTC TGCCAGTGTC 

1901 CTTGTCTGTA AGCGGATGCC GGGAGCAGAC AAGCCCGTCA GGGCGCGTCA 
GAACAGACAT TCGCCTACGG CCCTCGTCTG TTCGGGCAGT CCCGCGCAGT 

1951 GCGGGTGTTG GCGGGTGTCG GGGCTGGCTT AACTATGCGG CATCAGAGCA
CGCCCACAAC CGCCCACAGC CCCGACCGAA TTGATACGCC GTAGTCTCGT 

2001 GATTGTACTG AGAGTGCACC ATATGCGGTG TGAAATACCG CACAGATGCG 
CTAACATGAC TCTCACGTGG TATACGCCAC ACTTTATGGC GTGTCTACGC 

2051 TAAGGAGAAA ATACCGCATC AGGCGGCCTT AAGGGCCTCG TGATACGCCT 
ATTCCTCTTT TATGGCGTAG TCCGCCGGAA TTCCCGGAGC ACTATGCGGA 

2101 ATTTTTATAG GTTAATGTCA TGATAATAAT GGTTTCTTAG ACGTCAGGTG 
TAAAAATATC CAATTACAGT ACTATTATTA CCAAAGAATC TGCAGTCCAC 

2151 GCACTTTTCG GGGAAATGTG CGCGGAACCC CTATTTGTTT ATTTTTCTAA 
CGTGAAAAGC CCCTTTACAC GCGCCTTGGG GATAAACAAA TAAAAAGATT 



2201 ATACATTCAA ATATGTATCC GCTCATGAGA CAATAACCCT GATAAATGCT
TATGTAAGTT TATACATAGG CGAGTACTCT GTTATTGGGA CTATTTACGA

amp

2251 TCAATAATAT TGAAAAAGGA AGAGTATGAG TATTCAACAT TTCCGTGTCG 
AGTTATTATA ACTTTTTCCT TCTCATACTC ATAAGTTGTA AAGGCACAGC 

amp 



2301 CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT TGCTCACCCA 
GGGAATAAGG GAAAAAACGC CGTAAAACGG AAGGACAAAA ACGAGTGGGT 

amp 

2351 GAAACGCTGG TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT 
CTTTGCGACC ACTTTCATTT TCTACGACTT CTAGTCAACC CACGTGCTCA 

amp 



2401 GGGTTACATC GAACTGGATC TCAACAGCGG TAAGATCCTT GAGAGTTTTC
CCCAATGTAG CTTGACCTAG AGTTGTCGCC ATTCTAGGAA CTCTCAAAAG 

amp 



2451 GCCCCGAAGA ACGTTTTCCA ATGATGAGCA CTTTTAAAGT TCTGCTATGT
CGGGGCTTCT TGCAAAAGGT TACTACTCGT GAAAATTTCA AGACGATACA 



amp 



2501 GGCGCGGTAT TATCCCGTAT TGACGCCGGG CAAGAGCAAC TCGGTCGCCG 
CCGCGCCATA ATAGGGCATA ACTGCGGCCC GTTCTCGTTG AGCCAGCGGC 



amp 



2551 CATACACTAT TCTCAGAATG ACTTGGTTGA GTACTCACCA GTCACAGAAA 
GTATGTGATA AGAGTCTTAC TGAACCAACT CATGAGTGGT CAGTGTCTTT 



amp 



2601 AGCATCTTAC GGATGGCATG ACAGTAAGAG AATTATGCAG TGCTGCCATA 
TCGTAGAATG CCTACCGTAC TGTCATTCTC TTAATACGTC ACGACGGTAT 



amp 



2651 ACCATGAGTG ATAACACTGC GGCCAACTTA CTTCTGACAA CGATCGGAGG 
TGGTACTCAC TATTGTGACG CCGGTTGAAT GAAGACTGTT GCTAGCCTCC 



amp 



2701 ACCGAAGGAG CTAACCGCTT TTTTGCACAA CATGGGGGAT CATGTAACTC 
TGGCTTCCTC GATTGGCGAA AAAACGTGTT GTACCCCCTA GTACATTGAG 



amp 



2751 GCCTTGATCG TTGGGAACCG GAGCTGAATG AAGCCATACC AAACGACGAG 
CGGAACTAGC AACCCTTGGC CTCGACTTAC TTCGGTATGG TTTGCTGCTC

amp



2801 CGTGACACCA CGATGCCTGT AGCAATGGCA ACAACGTTGC GCAAACTATT 
GCACTGTGGT GCTACGGACA TCGTTACCGT TGTTGCAACG CGTTTGATAA 



amp 



2851 AACTGGCGAA CTACTTACTC TAGCTTCCCG GCAACAATTA ATAGACTGGA 
TTGACCGCTT GATGAATGAG ATCGAAGGGC CGTTGTTAAT TATCTGACCT 



amp 



2901 TGGAGGCGGA TAAAGTTGCA GGACCACTTC TGCGCTCGGC CCTTCCGGCT 
ACCTCCGCCT ATTTCAACGT CCTGGTGAAG ACGCGAGCCG GGAAGGCCGA 



amp 



2951 GGCTGGTTTA TTGCTGATAA ATCTGGAGCC GGTGAGCGTG GGTCTCGCGG 
CCGACCAAAT AACGACTATT TAGACCTCGG CCACTCGCAC CCAGAGCGCC 



amp 



3001 TATCATTGCA GCACTGGGGC CAGATGGTAA GCCCTCCCGT ATCGTAGTTA 
ATAGTAACGT CGTGACCCCG GTCTACCATT CGGGAGGGCA TAGCATCAAT 



amp 



3051 TCTACACGAC GGGGAGTCAG GCAACTATGG ATGAACGAAA TAGACAGATC 
AGATGTGCTG CCCCTCAGTC CGTTGATACC TACTTGCTTT ATCTGTCTAG 



amp 



3101 GCTGAGATAG GTGCCTCACT GATTAAGCAT TGGTAACTGT CAGACCAAGT 
CGACTCTATC CACGGAGTGA CTAATTCGTA ACCATTGACA GTCTGGTTCA 

3151 TTACTCATAT ATACTTTAGA TTGATTTAAA ACTTCATTTT TAATTTAAAA 
AATGAGTATA TATGAAATCT AACTAAATTT TGAAGTAAAA ATTAAATTTT 

3201 GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA AATCCCTTAA 
CCTAGATCCA CTTCTAGGAA AAACTATTAG AGTACTGGTT TTAGGGAATT 

3251 CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG 
GCACTCAAAA GCAAGGTGAC TCGCAGTCTG GGGCATCTTT TCTAGTTTCC 

3301 ATCTTCTTGA GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA 
TAGAAGAACT CTAGGAAAAA AAGACGCGCA TTAGACGACG AACGTTTGTT 

3351 AAAAACCACC GCTACCAGCG GTGGTTTGTT TGCCGGATCA AGAGCTACCA 
TTTTTGGTGG CGATGGTCGC CACCAAACAA ACGGCCTAGT TCTCGATGGT 

3401 ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC AGAGCGCAGA TACCAAATAC
TGAGAAAAAG GCTTCCATTG ACCGAAGTCG TCTCGCGTCT ATGGTTTATG

3451 TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG AACTCTGTAG
ACAGGAAGAT CACATCGGCA TCAATCCGGT GGTGAAGTTC TTGAGACATC 

3501 CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 
GTGGCGGATG TATGGAGCGA GACGATTAGG ACAATGGTCA CCGACGACGG 

3551 AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC 
TCACCGCTAT TCAGCACAGA ATGGCCCAAC CTGAGTTCTG CTATCAATGG 

3601 GGATAAGGCG CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA 
CCTATTCCGC GTCGCCAGCC CGACTTGCCC CCCAAGCACG TGTGTCGGGT 

3651 GCTTGGAGCG AACGACCTAC ACCGAACTGA GATACCTACA GCGTGAGCAT 
CGAACCTCGC TTGCTGGATG TGGCTTGACT CTATGGATGT CGCACTCGTA 

3701 TGAGAAAGCG CCACGCTTCC CGAAGGGAGA AAGGCGGACA GGTATCCGGT 
ACTCTTTCGC GGTGCGAAGG GCTTCCCTCT TTCCGCCTGT CCATAGGCCA 

3751 AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT CCAGGGGGAA 
TTCGCCGTCC CAGCCTTGTC CTCTCGCGTG CTCCCTCGAA GGTCCCCCTT 

3801 ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 
TGCGGACCAT AGAAATATCA GGACAGCCCA AAGCGGTGGA GACTGAACTC 

3851 CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC 
GCAGCTAAAA ACACTACGAG CAGTCCCCCC GCCTCGGATA CCTTTTTGCG

3901 CAGCAACGCG GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC 
GTCGTTGCGC CGGAAAAATG CCAAGGACCG GAAAACGACC GGAAAACGAG 

3951 ACATGTTCTT TCCTGCGTTA TCCCCTGATT CTGTGGATAA CCGTATTACC 
TGTACAAGAA AGGACGCAAT AGGGGACTAA GACACCTATT GGCATAATGG 

4001 GCCTTTGAGT GAGCTGATAC CGCTCGCCGC AGCCGAACGA CCGAGCGCAG 
CGGAAACTCA CTCGACTATG GCGAGCGGCG TCGGCTTGCT GGCTCGCGTC 

4051 CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCCAATACGC AAACCGCCTC 
GCTCAGTCAC TCGCTCCTTC GCCTTCTCGC GGGTTATGCG TTTGGCGGAG 

4101 TCCCCGCGCG TTGGCCGATT CATTAATGCA GCTGGCACGA CAGGTTTCCC 
AGGGGCGCGC AACCGGCTAA GTAATTACGT CGACCGTGCT GTCCAAAGGG 

4151 GACTGGAAAG CGGGCAGTGA GCGCAACGCA ATTAATGTGA GTTAGCTCAC 
CTGACCTTTC GCCCGTCACT CGCGTTGCGT TAATTACACT CAATCGAGTG 

4201 TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCGGCT CGTATGTTGT 
AGTAATCCGT GGGGTCCGAA ATGTGAAATA CGAAGGCCGA GCATACAACA

4251 GTGGAATTGT GAGCGGATAA CAATTTCACA CAGGAAACAG CT 
CACCTTAACA CTCGCCTATT GTTAAAGTGT GTCCTTTGTC GA

II. Predicted DNA sequence pGQ1

ctl-1 promoter + coding region

o-GQ1

PstI



1 ATGACCATGA TTACGCCAAG CTTGCATGCC TGCAGCCAAT GCATTGGAAG 
TACTGGTACT AATGCGGTTC GAACGTACGG ACGTCGGTTA CGTAACCTTC 

ctl-1 promoter + coding region 



o-GQ1



51 AGATATTTTG CGCGTCAAAT ATGTTTTGTG TCCCCGTAAT ATTTTTTTAA 
TCTATAAAAC GCGCAGTTTA TACAAAACAC AGGGGCATTA TAAAAAAATT 

ctl-1 promoter + coding region



101 ATCAAATTTC ACATTTTAAC CATAAAAAAC TCTTTCAAAA GTGTAATTTT 
TAGTTTAAAG TGTAAAATTG GTATTTTTTG AGAAAGTTTT CACATTAAAA

ctl-1 promoter + coding region



151 CTACGCAAAA ATGCCGTTCG GATGAAAAAT TACTTTTGAA AAACAAACTC
GATGCGTTTT TACGGCAAGC CTACTTTTTA ATGAAAACTT TTTGTTTGAG 

ctl-1 promoter + coding region



201 GAAACTACGG TACGCAAAAA AGTACATCGG TGTTTGCACA TAAGTGAAAA 
CTTTGATGCC ATGCGTTTTT TCATGTAGCC ACAAACGTGT ATTCACTTTT 

ctl-1 promoter + coding region



251 CAATGTTGTT TTTTTGTAAT TAAAATCGAT TAATTTTTTT TCCCGGAAAA 
GTTACAACAA AAAAACATTA ATTTTAGCTA ATTAAAAAAA AGGGCCTTTT 

ctl-1 promoter + coding region

301 CAAAAACGTT TTCAGCGTGG ATTTCTATTG TTTCTTGCGT AAAAAAAAAT 
GTTTTTGCAA AAGTCGCACC TAAAGATAAC AAAGAACGCA TTTTTTTTTA 

ctl-1 promoter + coding region

351 TATTTACCAA TTTTAAACGA TAATTTCCAC GAATTTTCGC CATTAATCTC 
ATAAATGGTT AAAATTTGCT ATTAAAGGTG CTTAAAAGCG GTAATTAGAG 

ctl-1 promoter + coding region

401 TCGATTTTGT TGATTCTTGA CTCCGAGCAA TCTCTCCGGT TTTCGCAAAC 
AGCTAAAACA ACTAAGAACT GAGGCTCGTT AGAGAGGCCA AAAGCGTTTG

ctl-1 promoter + coding region



451 GATTATATTA TTTATTTGTT TTCCTTTTCA GTGCCGATTC TCGGAAATTC 
CTAATATAAT AAATAAACAA AAGGAAAAGT CACGGCTAAG AGCCTTTAAG 

ctl-1 promoter + coding region 

Exon 1 



501 AACAGTAAAT CTTCAAAATG CCAATGCTTC CCCACATGGT CAATCTAAGT 
TTGTCATTTA GAAGTTTTAC GGTTACGAAG GGGTGTACCA GTTAGATTCA 

ctl-1 promoter + coding region 



551 GAGTTTCTTT GTTACAAAAT ACACGTGATG TCAGATTGTC TCATTTCGGT 
CTCAAAGAAA CAATGTTTTA TGTGCACTAC AGTCTAACAG AGTAAAGCCA 

ctl-1 promoter + coding region 

601 TTGATCTACG TAGATCTACA AAAAATGCGG GAATTGAGCC GCAGAGTTCT 
AACTAGATGC ATCTAGATGT TTTTTACGCC CTTAACTCGG CGTCTCAAGA 

ctl-1 promoter + coding region 

651 CAACTGCTTT CGCATGGTTA AGAACGTGCG GACGTCAAAT TGTTTTGGGC 
GTTGACGAAA GCGTACCAAT TCTTGCACGC CTGCAGTTTA ACAAAACCCG 

ctl-1 promoter + coding region 

701 AAAAATTCCC GCATTTTTTG TAGATCAAAC CGTAATGGGA CAGTCTGGCA 
TTTTTAAGGG CGTAAAAAAC ATCTAGTTTG GCATTACCCT GTCAGACCGT 



ctl-1 promoter + coding region 



Exon 2 



751 CCACGTGACT ATATATTTTT AGCGGTCAAC GACACAAAAC CCGGACCAAT 
GGTGCACTGA TATATAAAAA TCGCCAGTTG CTGTGTTTTG GGCCTGGTTA 

ctl-1 promoter + coding region 

Exon 2 

801 GGCTGAGGAT CAGCTGAAAG CTTATAGAGA TAGAAATCAG GTGAGAAAAA 
CCGACTCCTA GTCGACTTTC GAATATCTCT ATCTTTAGTC CACTCTTTTT 

ctl-1 promoter + coding region 

851 TCAATTTCAG CGATTTTCTT CGCAATTTAT ATAAAAACTG ATTTTTCCAG 
AGTTAAAGTC GCTAAAAGAA GCGTTAAATA TATTTTTGAG TAAAAAGGTC 

ctl-1 promoter + coding region 

Exon 3 partial

901 GAACCCCACC TGCTCACCAC ATCCAATGGA GCTCCGATCT ACTCGAAGAC
CTTGGGGTGG ACGAGTGGTG TAGGTTACCT CGAGGCTAGA TGAGCTTCTG 



ctl-1 promoter + coding region 



Exon 3 partial 



951 CGCCGTGCTC ACCGCCGGAC GACGTGGTCC AATGCTAATG CAGGACATCG 
GCGGCACGAG TGGCGGCCTG CTGCACCAGG TTACGATTAC GTCCTGTAGC 

ctl-1 promoter + coding region 



Exon 3 partial 



1001 TTTATATGGA CGAGATGGCT CATTTCGATC GTGAACGCAT CCCGGAGCGT 
AAATATACCT GCTCTACCGA GTAAAGCTAG CACTTGCGTA GGGCCTCGCA 

ctl-1 promoter + coding region 

Exon 3 partial 



1051 GTCGTCCATG CCAAAGGTGG TGGTGCTCAT GGATACTTCG AGGTCACCCA 
CAGCAGGTAC GGTTTCCACC ACCACGAGTA CCTATGAAGC TCCAGTGGGT 

ctl-1 promoter + coding region

Exon 3 partial 



1101 TGACATCACC AAGTACTGTA AGGCCGATAT GTTCAACAAG GTCGGAAAAC 
ACTGTAGTGG TTCATGACAT TCCGGCTATA CAAGTTGTTC CAGCCTTTTG 

ctl-1 promoter + coding region

o-GQ2bis 

Exon 3 partial 



BamHI 

1151 AGACACCACT TCTCGTTCGT TTTTCAACGG TCGCTGGAGA ATCGGCCGGA 
TCTGTGGTGA AGAGCAAGCA AAAAGTTGCC AGCGACCTCT TAGCCGGCCT

ctl-1 promoter + coding region 

o-GQ2bis

Exon 3 partial synth. intron A

BamHI

1201 TCCCCGGGAT TGGCCAAAGG ACCCAAAGGT ATGTTTCGAA TGATACTAAC 
AGGGGCCCTA ACCGGTTTCC TGGGTTTCCA TACAAAGCTT ACTATGATTG

synth. intron A

1251 ATAACATAGA ACATTTTCAG GAGGACCCTT GGCTAGCGTC GACGGTACCA 
TATTGTATCT TGTAAAAGTC CTCCTGGGAA CCGATCGCAG CTGCCATGGT 



GFPI 



1301 TGGGGCGCGC CATGAGTAAA GGAGAAGAAC TTTTCACTGG AGTTGTCCCA 
ACCCCGCGCG GTACTCATTT CCTCTTCTTG AAAAGTGACC TCAACAGGGT 



GFPI 

1351 ATTCTTGTTG AATTAGATGG TGATGTTAAT GGGCACAAAT TTTCTGTCAG 
TAAGAACAAC TTAATCTACC ACTACAATTA CCCGTGTTTA AAAGACAGTC 

GFPI 



1401 TGGAGAGGGT GAAGGTGATG CAACATACGG AAAACTTACC CTTAAATTTA 
ACCTCTCCCA CTTCCACTAC GTTGTATGCC TTTTGAATGG GAATTTAAAT 



GFPI 

1451 TTTGCACTAC TGGAAAACTA CCTGTTCCAT GGGTAAGTTT AAACATATAT
AAACGTGATG ACCTTTTGAT GGACAAGGTA CCCATTCAAA TTTGTATATA 

GFPII

1501 ATACTAACTA ACCCTGATTA TTTAAATTTT CAGCCAACAC TTGTCACTAC 
TATGATTGAT TGGGACTAAT AAATTTAAAA GTCGGTTGTG AACAGTGATG 



GFPII



1551 TTTCTGTTAT GGTGTTCAAT GCTTCTCGAG ATACCCAGAT CATATGAAAC 
AAAGACAATA CCACAAGTTA CGAAGAGCTC TATGGGTCTA GTATACTTTG 

GFPII 



1601 GGCATGACTT TTTCAAGAGT GCCATGCCCG AAGGTTATGT ACAGGAAAGA 
CCGTACTGAA AAAGTTCTCA CGGTACGGGC TTCCAATACA TGTCCTTTCT 

GFPII 

1651 ACTATATTTT TCAAAGATGA CGGGAACTAC AAGACACGTA AGTTTAAACA 
TGATATAAAA AGTTTCTACT GCCCTTGATG TTGTGTGCAT TCAAATTTGT 

GFPIII

1701 GTTCGGTACT AACTAACCAT ACATATTTAA ATTTTCAGGT GCTGAAGTCA 
CAAGCCATGA TTGATTGGTA TGTATAAATT TAAAAGTCCA CGACTTCAGT 

GFPIII 



1751 AGTTTGAAGG TGATACCCTT GTTAATAGAA TCGAGTTAAA AGGTATTGAT
TCAAACTTCC ACTATGGGAA CAATTATCTT AGCTCAATTT TCCATAACTA

GFPIII



1801 TTTAAAGAAG ATGGAAACAT TCTTGGACAC AAATTGGAAT ACAACTATAA 
AAATTTCTTC TACCTTTGTA AGAACCTGTG TTTAACCTTA TGTTGATATT 



GFPIII 



1851 CTCACACAAT GTATACATCA TGGCAGACAA ACAAAAGAAT GGAATCAAAG 
GAGTGTGTTA CATATGTAGT ACCGTCTGTT TGTTTTCTTA CCTTAGTTTC 



GFPIII 



1901 TTGTAAGTTT AAACTTGGAC TTACTAACTA ACGGATTATA TTTAAATTTT 
AACATTCAAA TTTGAACCTG AATGATTGAT TGCCTAATAT AAATTTAAAA 



GFPIV 



1951 CAGAACTTCA AAATTAGACA CAACATTGAA GATGGAAGCG TTCAACTAGC 
GTCTTGAAGT TTTAATCTGT GTTGTAACTT CTACCTTCGC AAGTTGATCG 



GFPIV 



2001 AGACCATTAT CAACAAAATA CTCCAATTGG CGATGGCCCT GTCCTTTTAC 
TCTGGTAATA GTTGTTTTAT GAGGTTAACC GCTACCGGGA CAGGAAAATG 



GFPIV 



2051 CAGACAACCA TTACCTGTCC ACACAATCTG CCCTTTCGAA AGATCCCAAC 
GTCTGTTGGT AATGGACAGG TGTGTTAGAC GGGAAAGCTT TCTAGGGTTG 



GFPIV 



2101 GAAAAGAGAG ACCACATGGT CCTTCTTGAG TTTGTAACAG CTGCTGGGAT 
CTTTTCTCTC TGGTGTACCA GGAAGAACTC AAACATTGTC GACGACCCTA 



GFPIV 



FseI



2151 TACACATGGC ATGGATGAAC TATACAAATA GGGCCGGCCG AGCTCCGCAT 
ATGTGTACCG TACCTACTTG ATATGTTTAT CCCGGCCGGC TCGAGGCGTA 

unc-54 3' UTR 

2201 CGGCCGCTGT CATCAGATCG CCATCTCGCG CCCGTGCCTC TGACTTCTAA 
GCCGGCGACA GTAGTCTAGC GGTAGAGCGC GGGCACGGAG ACTGAAGATT 

unc-54 3' UTR

2251 GTCCAATTAC TCTTCAACAT CCCTACATGC TCTTTCTCCC TGTGCTCCCA 
CAGGTTAATG AGAAGTTGTA GGGATGTACG AGAAAGAGGG ACACGAGGGT 

unc-54 3' UTR

2301 CCCCCTATTT TTGTTATTAT CAAAAAAACT TCTTCTTAAT TTCTTTGTTT
GGGGGATAAA AACAATAATA GTTTTTTTGA AGAAGAATTA AAGAAACAAA

unc-54 3' UTR 



2351 TTTAGCTTCT TTTAAGTCAC CTCTAACAAT GAAATTGTGT AGATTCAAAA 
AAATCGAAGA AAATTCAGTG GAGATTGTTA CTTTAACACA TCTAAGTTTT 

unc-54 3' UTR 



2401 ATAGAATTAA TTCGTAATAA AAAGTCGAAA AAAATTGTGC TCCCTCCCCC 
TATCTTAATT AAGCATTATT TTTCAGCTTT TTTTAACACG AGGGAGGGGG 

unc-54 3' UTR 



2451 CATTAATAAT AATTCTATCC CAAAATCTAC ACAATGTTCT GTGTACACTT 
GTAATTATTA TTAAGATAGG GTTTTAGATG TGTTACAAGA CACATGTGAA 

unc-54 3' UTR

2501 CTTATGTTTT TTTTACTTCT GATAAATTTT TTTTGAAACA TCATAGAAAA 
GAATACAAAA AAAATGAAGA CTATTTAAAA AAAACTTTGT AGTATCTTTT 

unc-54 3' UTR 

2551 AACCGCACAC AAAATACCTT ATCATATGTT ACGTTTCAGT TTATGACCGC 
TTGGCGTGTG TTTTATGGAA TAGTATACAA TGCAAAGTCA AATACTGGCG

unc-54 3' UTR 



2601 AATTTTTATT TCTTCGCACG TCTGGGCCTC TCATGACGTC AAATCATGCT 
TTAAAAATAA AGAAGCGTGC AGACCCGGAG AGTACTGCAG TTTAGTACGA 

unc-54 3' UTR 



2651 CATCGTGAAA AAGTTTTGGA GTATTTTTGG AATTTTTCAA TCAAGTGAAA 
GTAGCACTTT TTCAAAACCT CATAAAAACC TTAAAAAGTT AGTTCACTTT 

unc-54 3' UTR 



2701 GTTTATGAAA TTAATTTTCC TGCTTTTGCT TTTTGGGGGT TTCCCCTATT 
CAAATACTTT AATTAAAAGG ACGAAAACGA AAAACCCCCA AAGGGGATAA 

unc-54 3' UTR



2751 GTTTGTCAAG AGTTTCGAGG ACGGCGTTTT TCTTGCTAAA ATCACAAGTA 
CAAACAGTTC TCAAAGCTCC TGCCGCAAAA AGAACGATTT TAGTGTTCAT 



unc-54 3' UTR



2801 TTGATGAGCA CGATGCAAGA AAGATCGGAA GAAGGTTTGG GTTTGAGGCT 
AACTACTCGT GCTACGTTCT TTCTAGCCTT CTTCCAAACC CAAACTCCGA 

unc-54 3' UTR

2851 CAGTGGAAGG TGAGTAGAAG TTGATAATTT GAAAGTGGAG TAGTGTCTAT
GTCACCTTCC ACTCATCTTC AACTATTAAA CTTTCACCTC ATCACAGATA 

unc-54 3' UTR 



2901 GGGGTTTTTG CCTTAAATGA CAGAATACAT TCCCAATATA CCAAACATAA 
CCCCAAAAAC GGAATTTACT GTCTTATGTA AGGGTTATAT GGTTTGTATT 

unc-54 3' UTR

2951 CTGTTTCCTA CTAGTCGGCC GTACGGGCCC TTTCGTCTCG CGCGTTTCGG
GACAAAGGAT GATCAGCCGG CATGCCCGGG AAAGCAGAGC GCGCAAAGCC 

3001 TGATGACGGT GAAAACCTCT GACACATGCA GCTCCCGGAG ACGGTCACAG 
ACTACTGCCA CTTTTGGAGA CTGTGTACGT CGAGGGCCTC TGCCAGTGTC 

3051 CTTGTCTGTA AGCGGATGCC GGGAGCAGAC AAGCCCGTCA GGGCGCGTCA 
GAACAGACAT TCGCCTACGG CCCTCGTCTG TTCGGGCAGT CCCGCGCAGT 

3101 GCGGGTGTTG GCGGGTGTCG GGGCTGGCTT AACTATGCGG CATCAGAGCA 
CGCCCACAAC CGCCCACAGC CCCGACCGAA TTGATACGCC GTAGTCTCGT 

3151 GATTGTACTG AGAGTGCACC ATATGCGGTG TGAAATACCG CACAGATGCG 
CTAACATGAC TCTCACGTGG TATACGCCAC ACTTTATGGC GTGTCTACGC 

3201 TAAGGAGAAA ATACCGCATC AGGCGGCCTT AAGGGCCTCG TGATACGCCT 
ATTCCTCTTT TATGGCGTAG TCCGCCGGAA TTCCCGGAGC ACTATGCGGA 

3251 ATTTTTATAG GTTAATGTCA TGATAATAAT GGTTTCTTAG ACGTCAGGTG 
TAAAAATATC CAATTACAGT ACTATTATTA CCAAAGAATC TGCAGTCCAC

3301 GCACTTTTCG GGGAAATGTG CGCGGAACCC CTATTTGTTT ATTTTTCTAA 
CGTGAAAAGC CCCTTTACAC GCGCCTTGGG GATAAACAAA TAAAAAGATT 

3351 ATACATTCAA ATATGTATCC GCTCATGAGA CAATAACCCT GATAAATGCT 
TATGTAAGTT TATACATAGG CGAGTACTCT GTTATTGGGA CTATTTACGA 

amp 



3401 TCAATAATAT TGAAAAAGGA AGAGTATGAG TATTCAACAT TTCCGTGTCG
AGTTATTATA ACTTTTTCCT TCTCATACTC ATAAGTTGTA AAGGCACAGC 

amp 



3451 CCCTTATTCC CTTTTTTGCG GCATTTTGCC TTCCTGTTTT TGCTCACCCA
GGGAATAAGG GAAAAAACGC CGTAAAACGG AAGGACAAAA ACGAGTGGGT 

amp 



3501 GAAACGCTGG TGAAAGTAAA AGATGCTGAA GATCAGTTGG GTGCACGAGT 
CTTTGCGACC ACTTTCATTT TCTACGACTT CTAGTCAACC CACGTGCTCA



amp

3551 GGGTTACATC GAACTGGATC TCAACAGCGG TAAGATCCTT GAGAGTTTTC
CCCAATGTAG CTTGACCTAG AGTTGTCGCC ATTCTAGGAA CTCTCAAAAG 

amp 



3601 GCCCCGAAGA ACGTTTTCCA ATGATGAGCA CTTTTAAAGT TCTGCTATGT 
CGGGGCTTCT TGCAAAAGGT TACTACTCGT GAAAATTTCA AGACGATACA 

amp 



3651 GGCGCGGTAT TATCCCGTAT TGACGCCGGG CAAGAGCAAC TCGGTCGCCG 
CCGCGCCATA ATAGGGCATA ACTGCGGCCC GTTCTCGTTG AGCCAGCGGC 



amp 



3701 CATACACTAT TCTCAGAATG ACTTGGTTGA GTACTCACCA GTCACAGAAA 
GTATGTGATA AGAGTCTTAC TGAACCAACT CATGAGTGGT CAGTGTCTTT 



amp 



3751 AGCATCTTAC GGATGGCATG ACAGTAAGAG AATTATGCAG TGCTGCCATA 
TCGTAGAATG CCTACCGTAC TGTCATTCTC TTAATACGTC ACGACGGTAT 



amp 



3801 ACCATGAGTG ATAACACTGC GGCCAACTTA CTTCTGACAA CGATCGGAGG
TGGTACTCAC TATTGTGACG CCGGTTGAAT GAAGACTGTT GCTAGCCTCC 

amp 



3851 ACCGAAGGAG CTAACCGCTT TTTTGCACAA CATGGGGGAT CATGTAACTC 
TGGCTTCCTC GATTGGCGAA AAAACGTGTT GTACCCCCTA GTACATTGAG 

amp 

3901 GCCTTGATCG TTGGGAACCG GAGCTGAATG AAGCCATACC AAACGACGAG 
CGGAACTAGC AACCCTTGGC CTCGACTTAC TTCGGTATGG TTTGCTGCTC 

amp 



3951 CGTGACACCA CGATGCCTGT AGCAATGGCA ACAACGTTGC GCAAACTATT
GCACTGTGGT GCTACGGACA TCGTTACCGT TGTTGCAACG CGTTTGATAA 



amp 

4001 AACTGGCGAA CTACTTACTC TAGCTTCCCG GCAACAATTA ATAGACTGGA 
TTGACCGCTT GATGAATGAG ATCGAAGGGC CGTTGTTAAT TATCTGACCT 

amp 



4051 TGGAGGCGGA TAAAGTTGCA GGACCACTTC TGCGCTCGGC CCTTCCGGCT
ACCTCCGCCT ATTTCAACGT CCTGGTGAAG ACGCGAGCCG GGAAGGCCGA 



amp

4101 GGCTGGTTTA TTGCTGATAA ATCTGGAGCC GGTGAGCGTG GGTCTCGCGG
CCGACCAAAT AACGACTATT TAGACCTCGG CCACTCGCAC CCAGAGCGCC



amp 



4151 TATCATTGCA GCACTGGGGC CAGATGGTAA GCCCTCCCGT ATCGTAGTTA
ATAGTAACGT CGTGACCCCG GTCTACCATT CGGGAGGGCA TAGCATCAAT 



amp 



4201 TCTACACGAC GGGGAGTCAG GCAACTATGG ATGAACGAAA TAGACAGATC 
AGATGTGCTG CCCCTCAGTC CGTTGATACC TACTTGCTTT ATCTGTCTAG 



amp 



4251 GCTGAGATAG GTGCCTCACT GATTAAGCAT TGGTAACTGT CAGACCAAGT 
CGACTCTATC CACGGAGTGA CTAATTCGTA ACCATTGACA GTCTGGTTCA 

4301 TTACTCATAT ATACTTTAGA TTGATTTAAA ACTTCATTTT TAATTTAAAA 
AATGAGTATA TATGAAATCT AACTAAATTT TGAAGTAAAA ATTAAATTTT 

4351 GGATCTAGGT GAAGATCCTT TTTGATAATC TCATGACCAA AATCCCTTAA 
CCTAGATCCA CTTCTAGGAA AAACTATTAG AGTACTGGTT TTAGGGAATT 

4401 CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG
GCACTCAAAA GCAAGGTGAC TCGCAGTCTG GGGCATCTTT TCTAGTTTCC 

4451 ATCTTCTTGA GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA 
TAGAAGAACT CTAGGAAAAA AAGACGCGCA TTAGACGACG AACGTTTGTT 

4501 AAAAACCACC GCTACCAGCG GTGGTTTGTT TGCCGGATCA AGAGCTACCA 
TTTTTGGTGG CGATGGTCGC CACCAAACAA ACGGCCTAGT TCTCGATGGT 

4551 ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC AGAGCGCAGA TACCAAATAC 
TGAGAAAAAG GCTTCCATTG ACCGAAGTCG TCTCGCGTCT ATGGTTTATG 

4601 TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG AACTCTGTAG 
ACAGGAAGAT CACATCGGCA TCAATCCGGT GGTGAAGTTC TTGAGACATC 

4651 CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC
GTGGCGGATG TATGGAGCGA GACGATTAGG ACAATGGTCA CCGACGACGG 

4701 AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC 
TCACCGCTAT TCAGCACAGA ATGGCCCAAC CTGAGTTCTG CTATCAATGG 

4751 GGATAAGGCG CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA 
CCTATTCCGC GTCGCCAGCC CGACTTGCCC CCCAAGCACG TGTGTCGGGT 

4801 GCTTGGAGCG AACGACCTAC ACCGAACTGA GATACCTACA GCGTGAGCAT
CGAACCTCGC TTGCTGGATG TGGCTTGACT CTATGGATGT CGCACTCGTA 



4851 TGAGAAAGCG CCACGCTTCC CGAAGGGAGA AAGGCGGACA GGTATCCGGT 
ACTCTTTCGC GGTGCGAAGG GCTTCCCTCT TTCCGCCTGT CCATAGGCCA

4901 AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT CCAGGGGGAA
TTCGCCGTCC CAGCCTTGTC CTCTCGCGTG CTCCCTCGAA GGTCCCCCTT 

4951 ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG
TGCGGACCAT AGAAATATCA GGACAGCCCA AAGCGGTGGA GACTGAACTC 

5001 CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC 
GCAGCTAAAA ACACTACGAG CAGTCCCCCC GCCTCGGATA CCTTTTTGCG 

5051 CAGCAACGCG GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC 
GTCGTTGCGC CGGAAAAATG CCAAGGACCG GAAAACGACC GGAAAACGAG 

5101 ACATGTTCTT TCCTGCGTTA TCCCCTGATT CTGTGGATAA CCGTATTACC 
TGTACAAGAA AGGACGCAAT AGGGGACTAA GACACCTATT GGCATAATGG 

5151 GCCTTTGAGT GAGCTGATAC CGCTCGCCGC AGCCGAACGA CCGAGCGCAG 
CGGAAACTCA CTCGACTATG GCGAGCGGCG TCGGCTTGCT GGCTCGCGTC 

5201 CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCCAATACGC AAACCGCCTC 
GCTCAGTCAC TCGCTCCTTC GCCTTCTCGC GGGTTATGCG TTTGGCGGAG 

5251 TCCCCGCGCG TTGGCCGATT CATTAATGCA GCTGGCACGA CAGGTTTCCC
AGGGGCGCGC AACCGGCTAA GTAATTACGT CGACCGTGCT GTCCAAAGGG 

5301 GACTGGAAAG CGGGCAGTGA GCGCAACGCA ATTAATGTGA GTTAGCTCAC 
CTGACCTTTC GCCCGTCACT CGCGTTGCGT TAATTACACT CAATCGAGTG 

5351 TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCGGCT CGTATGTTGT 
AGTAATCCGT GGGGTCCGAA ATGTGAAATA CGAAGGCCGA GCATACAACA 

5401 GTGGAATTGT GAGCGGATAA CAATTTCACA CAGGAAACAG CT 
CACCTTAACA CTCGCCTATT GTTAAAGTGT GTCCTTTGTC GA

pCluc6 sequence:

AUG luc+

1 ATGACTGCTC CAAAGAAGAA GCGTAAGGTA CCGGTAGAAA AAATGGAAGA 
TACTGACGAG GTTTCTTCTT CGCATTCCAT GGCCATCTTT TTTACCTTCT 

luc+ 



51 CGCCAAAAAC ATAAAGAAAG GCCCGGCGCC ATTCTATCCG CTGGAAGATG 
GCGGTTTTTG TATTTCTTTC CGGGCCGCGG TAAGATAGGC GACCTTCTAC 



luc+ 



101 GAACCGCTGG AGAGCAACTG CATAAGGCTA TGAAGAGATA CGCCCTGGTT 
CTTGGCGACC TCTCGTTGAC GTATTCCGAT ACTTCTCTAT GCGGGACCAA 



luc+ 



151 CCTGGAACAA TTGCTTTTAC AGATGCACAT ATCGAGGTGG ACATCACTTA 
GGACCTTGTT AACGAAAATG TCTACGTGTA TAGCTCCACC TGTAGTGAAT 



luc+ 



201 CGCTGAGTAC TTCGAAATGT CCGTTCGGTT GGCAGAAGCT ATGAAACGAT 
GCGACTCATG AAGCTTTACA GGCAAGCCAA CCGTCTTCGA TACTTTGCTA



luc+ 



251 ATGGGCTGAA TACAAATCAC AGAATCGTCG TATGCAGTGA AAACTCTCTT 
TACCCGACTT ATGTTTAGTG TCTTAGCAGC ATACGTCACT TTTGAGAGAA 



luc+



301 CAATTCTTTA TGCCGGTGTT GGGCGCGTTA TTTATCGGAG TTGCAGTTGC 
GTTAAGAAAT ACGGCCACAA CCCGCGCAAT AAATAGCCTC AACGTCAACG 



luc+ 



351 GCCCGCGAAC GACATTTATA ATGAACGTGA ATTGCTCAAC AGTATGGGCA 
CGGGCGCTTG CTGTAAATAT TACTTGCACT TAACGAGTTG TCATACCCGT 

luc+ 



401 TTTCGCAGCC TACCGTGGTG TTCGTTTCCA AAAAGGGGTT GCAAAAAATT 
AAAGCGTCGG ATGGCACCAC AAGCAAAGGT TTTTCCCCAA CGTTTTTTAA



luc+ 



451 TTGAACGTGC AAAAAAAGCT CCCAATCATC CAAAAAATTA TTATCATGGA 
AACTTGCACG TTTTTTTCGA GGGTTAGTAG GTTTTTTAAT AATAGTACCT 



luc+

501 TTCTAAAACG GATTACCAGG GATTTCAGTC GATGTACACG TTCGTCACAT
AAGATTTTGC CTAATGGTCC CTAAAGTCAG CTACATGTGC AAGCAGTGTA 

luc+ 



551 CTCATCTACC TCCCGGTTTT AATGAATACG ATTTTGTGCC AGAGTCCTTC 
GAGTAGATGG AGGGCCAAAA TTACTTATGC TAAAACACGG TCTCAGGAAG 

luc+ 



601 GATAGGGACA AGACAATTGC ACTGATCATG AACTCCTCTG GATCTACTGG 
CTATCCCTGT TCTGTTAACG TGACTAGTAC TTGAGGAGAC CTAGATGACC 

luc+



651 TCTGCCTAAA GGTGTCGCTC TGCCTCATAG AACTGCCTGC GTGAGATTCT 
AGACGGATTT CCACAGCGAG ACGGAGTATC TTGACGGACG CACTCTAAGA 

luc+ 

701 CGCATGCCAG AGATCCTATT TTTGGCAATC AAATCATTCC GGATACTGCG 
GCGTACGGTC TCTAGGATAA AAACCGTTAG TTTAGTAAGG CCTATGACGC



751 ATTTTAAGTG TTGTTCCATT CCATCACGGT TTTGGAATGT TTACTACACT 
TAAAATTCAC AACAAGGTAA GGTAGTGCCA AAACCTTACA AATGATGTGA 

luc+ 

801 CGGATATTTG ATATGTGGAT TTCGAGTCGT CTTAATGTAT AGATTTGAAG 
GCCTATAAAC TATACACCTA AAGCTCAGCA GAATTACATA TCTAAACTTC 



851 AAGAGCTGTT TCTGAGGAGC CTTCAGGATT ACAAGATTCA AAGTGCGCTG 
TTCTCGACAA AGACTCCTCG GAAGTCCTAA TGTTCTAAGT TTCACGCGAC 



901 CTGGTGCCAA CCCTATTCTC CTTCTTCGCC AAAAGCACTC TGATTGACAA 
GACCACGGTT GGGATAAGAG GAAGAAGCGG TTTTCGTGAG ACTAACTGTT 

luc+ 



951 ATACGATTTA TCTAATTTAC ACGAAATTGC TTCTGGTGGC GCTCCCCTCT 
TATGCTAAAT AGATTAAATG TGCTTTAACG AAGACCACCG CGAGGGGAGA 

luc+ 



1001 CTAAGGAAGT CGGGGAAGCG GTTGCCAAGA GGTTCCATCT GCCAGGTATC 
GATTCCTTCA GCCCCTTCGC CAACGGTTCT CCAAGGTAGA CGGTCCATAG

luc+

1051 AGGCAAGGAT ATGGGCTCAC TGAGACTACA TCAGCTATTC TGATTACACC
TCCGTTCCTA TACCCGAGTG ACTCTGATGT AGTCGATAAG ACTAATGTGG

luc+

1101 CGAGGGGGAT GATAAACCGG GCGCGGTCGG TAAAGTTGTT CCATTTTTTG
GCTCCCCCTA CTATTTGGCC CGCGCCAGCC ATTTCAACAA GGTAAAAAAC

luc+

1151 AAGCGAAGGT TGTGGATCTG GATACCGGGA AAACGCTGGG CGTTAATCAA
TTCGCTTCCA ACACCTAGAC CTATGGCCCT TTTGCGACCC GCAATTAGTT

luc+

1201 AGAGGCGAAC TGTGTGTGAG AGGTCCTATG ATTATGTCCG GTTATGTAAA
TCTCCGCTTG ACACACACTC TCCAGGATAC TAATACAGGC CAATACATTT

luc+

1251 CAATCCGGAA GCGACCAACG CCTTGATTGA CAAGGATGGA TGGCTACATT
GTTAGGCCTT CGCTGGTTGC GGAACTAACT GTTCCTACCT ACCGATGTAA

luc+

1301 CTGGAGACAT AGCTTACTGG GACGAAGACG AACACTTCTT CATCGTTGAC
GACCTCTGTA TCGAATGACC CTGCTTCTGC TTGTGAAGAA GTAGCAACTG




luc+ 











1351 CGCCTGAAGT CTCTGATTAA GTACAAAGGC TATCAGGTGG CTCCCGCTGA 
GCGGACTTCA GAGACTAATT CATGTTTCCG ATAGTCCACC GAGGGCGACT 



luc+ 



1401 ATTGGAATCC ATCTTGCTCC AACACCCCAA CATCTTCGAC GCAGGTGTCG 
TAACCTTAGG TAGAACGAGG TTGTGGGGTT GTAGAAGCTG CGTCCACAGC 

luc+ 

1451 CAGGTCTTCC CGACGATGAC GCCGGTGAAC TTCCCGCCGC CGTTGTTGTT
GTCCAGAAGG GCTGCTACTG CGGCCACTTG AAGGGCGGCG GCAACAACAA 

luc+



1501 TTGGAGCACG GAAAGACGAT GACGGAAAAA GAGATCGTGG ATTACGTCGC
AACCTCGTGC CTTTCTGCTA CTGCCTTTTT CTCTAGCACC TAATGCAGCG

luc+ 



1551 CAGTCAAGTA ACAACCGCGA AAAAGTTGCG CGGAGGAGTT GTGTTTGTGG 
GTCAGTTCAT TGTTGGCGCT TTTTCAACGC GCCTCCTCAA CACAAACACC

luc+

1601 ACGAAGTACC GAAAGGTCTT ACCGGAAAAC TCGACGCAAG AAAAATCAGA
TGCTTCATGG CTTTCCAGAA TGGCCTTTTG AGCTGCGTTC TTTTTAGTCT

luc+ 

1651 GAGATCCTCA TAAAGGCCAA GAAGGGCGGA AAGATCGCCG TGTAATTCTA 
CTCTAGGAGT ATTTCCGGTT CTTCCCGCCT TTCTAGCGGC ACATTAAGAT 

unc-54 3' UTR 

1701 GGAATTCCAA CTGAGCGCCG GTCGCTACCA TTACCAACTT GTCTGGTGTC
CCTTAAGGTT GACTCGCGGC CAGCGATGGT AATGGTTGAA CAGACCACAG 

unc-54 3' UTR

1751 AAAAATAATA GGGGCCGCTG TCATCAGAGT AAGTTTAAAC TGAGTTCTAC 
TTTTTATTAT CCCCGGCGAC AGTAGTCTCA TTCAAATTTG ACTCAAGATG 



unc-54 3' UTR 

1801 TAACTAACGA GTAATATTTA AATTTTCAGC ATCTCGCGCC CGTGCCTCTG 
ATTGATTGCT CATTATAAAT TTAAAAGTCG TAGAGCGCGG GCACGGAGAC 

unc-54 3' UTR 



1851 ACTTCTAAGT CCAATTACTC TTCAACATCC CTACATGCTC TTTCTCCCTG 
TGAAGATTCA GGTTAATGAG AAGTTGTAGG GATGTACGAG AAAGAGGGAC 

unc-54 3' UTR 



1901 TGCTCCCACC CCCTATTTTT GTTATTATCA AAAAAACTTC TTCTTAATTT 
ACGAGGGTGG GGGATAAAAA CAATAATAGT TTTTTTGAAG AAGAATTAAA 

unc-54 3' UTR 



1951 CTTTGTTTTT TAGCTTCTTT TAAGTCACCT CTAACAATGA AATTGTGTAG 
GAAACAAAAA ATCGAAGAAA ATTCAGTGGA GATTGTTACT TTAACACATC 

unc-54 3' UTR

2001 ATTCAAAAAT AGAATTAATT CGTAATAAAA AGTCGAAAAA AATTGTGCTC 
TAAGTTTTTA TCTTAATTAA GCATTATTTT TCAGCTTTTT TTAACACGAG 


unc-54 3' UTR 



2051 CCTCCCCCCA TTAATAATAA TTCTATCCCA AAATCTACAC AATGTTCTGT 
GGAGGGGGGT AATTATTATT AAGATAGGGT TTTAGATGTG TTACAAGACA 



unc-54 3' UTR 



2101 GTACACTTCT TATGTTTTTT TTACTTCTGA TAAATTTTTT TTGAAACATC
CATGTGAAGA ATACAAAAAA AATGAAGACT ATTTAAAAAA AACTTTGTAG

unc-54 3' UTR



2151 ATAGAAAAAA CCGCACACAA AATACCTTAT CATATGTTAC GTTTCAGTTT 
TATCTTTTTT GGCGTGTGTT TTATGGAATA GTATACAATG CAAAGTCAAA 

unc-54 3' UTR 



2201 ATGACCGCAA TTTTTATTTC TTCGCACGTC TGGGCCTCTC ATGACGTCAA 
TACTGGCGTT AAAAATAAAG AAGCGTGCAG ACCCGGAGAG TACTGCAGTT 

unc-54 3' UTR 



2251 ATCATGCTCA TCGTGAAAAA GTTTTGGAGT ATTTTTGGAA TTTTTCAATC 
TAGTACGAGT AGCACTTTTT CAAAACCTCA TAAAAACCTT AAAAAGTTAG 

unc-54 3' UTR



2301 AAGTGAAAGT TTATGAAATT AATTTTCCTG CTTTTGCTTT TTGGGGGTTT 
TTCACTTTCA AATACTTTAA TTAAAAGGAC GAAAACGAAA AACCCCCAAA 

unc-54 3' UTR 



2351 CCCCTATTGT TTGTCAAGAG TTTCGAGGAC GGCGTTTTTC TTGCTAAAAT 
GGGGATAACA AACAGTTCTC AAAGCTCCTG CCGCAAAAAG AACGATTTTA

unc-54 3' UTR



2401 CACAAGTATT GATGAGCACG ATGCAAGAAA GATCGGAAGA AGGTTTGGGT 
GTGTTCATAA CTACTCGTGC TACGTTCTTT CTAGCCTTCT TCCAAACCCA 

unc-54 3' UTR



2451 TTGAGGCTCA GTGGAAGGTG AGTAGAAGTT GATAATTTGA AAGTGGAGTA 
AACTCCGAGT CACCTTCCAC TCATCTTCAA CTATTAAACT TTCACCTCAT 

unc-54 3' UTR 



2501 GTGTCTATGG GGTTTTTGCC TTAAATGACA GAATACATTC CCAATATACC 
CACAGATACC CCAAAAACGG AATTTACTGT CTTATGTAAG GGTTATATGG 

unc-54 3' UTR MSC II 



2551 AAACATAACT GTTTCCTACT AGTCGGCCGT ACGGGCCCTT TCGTCTCGCG 
TTTGTATTGA CAAAGGATGA TCAGCCGGCA TGCCCGGGAA AGCAGAGCGC 

2601 CGTTTCGGTG ATGACGGTGA AAACCTCTGA CACATGCAGC TCCCGGAGAC 
GCAAAGCCAC TACTGCCACT TTTGGAGACT GTGTACGTCG AGGGCCTCTG 

2651 GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA GCCCGTCAGG 
CCAGTGTCGA ACAGACATTC GCCTACGGCC CTCGTCTGTT CGGGCAGTCC 



2701 GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA
CGCGCAGTCG CCCACAACCG CCCACAGCCC CGACCGAATT GATACGCCGT

2751 TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA 
AGTCTCGTCT AACATGACTC TCACGTGGTA TACGCCACAC TTTATGGCGT 

2801 CAGATGCGTA AGGAGAAAAT ACCGCATCAG GCGGCCTTAA GGGCCTCGTG 
GTCTACGCAT TCCTCTTTTA TGGCGTAGTC CGCCGGAATT CCCGGAGCAC 

2851 ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG TTTCTTAGAC 
TATGCGGATA AAAATATCCA ATTACAGTAC TATTATTACC AAAGAATCTG 

2901 GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT
CAGTCCACCG TGAAAAGCCC CTTTACACGC GCCTTGGGGA TAAACAAATA 

2951 TTTTCTAAAT ACATTCAAAT ATGTATCCGC TCATGAGACA ATAACCCTGA
AAAAGATTTA TGTAAGTTTA TACATAGGCG AGTACTCTGT TATTGGGACT 

amp 



3001 TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA TTCAACATTT 
ATTTACGAAG TTATTATAAC TTTTTCCTTC TCATACTCAT AAGTTGTAAA 

amp 



3051 CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG 
GGCACAGCGG GAATAAGGGA AAAAACGCCG TAAAACGGAA GGACAAAAAC

amp 



3101 CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT 
GAGTGGGTCT TTGCGACCAC TTTCATTTTC TACGACTTCT AGTCAACCCA 

amp 



3151 GCACGAGTGG GTTACATCGA ACTGGATCTC AACAGCGGTA AGATCCTTGA 
CGTGCTCACC CAATGTAGCT TGACCTAGAG TTGTCGCCAT TCTAGGAACT 

amp 



3201 GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT TTTAAAGTTC 
CTCAAAAGCG GGGCTTCTTG CAAAAGGTTA CTACTCGTGA AAATTTCAAG 

amp 



3251 TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC 
ACGATACACC GCGCCATAAT AGGGCATAAC TGCGGCCCGT TCTCGTTGAG 

amp 



3301 GGTCGCCGCA TACACTATTC TCAGAATGAC TTGGTTGAGT ACTCACCAGT 
CCAGCGGCGT ATGTGATAAG AGTCTTACTG AACCAACTCA TGAGTGGTCA 



amp 



3351 CACAGAAAAG CATCTTACGG ATGGCATGAC AGTAAGAGAA TTATGCAGTG 
GTGTCTTTTC GTAGAATGCC TACCGTACTG TCATTCTCTT AATACGTCAC 

amp 



3401 CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG
GACGGTATTG GTACTCACTA TTGTGACGCC GGTTGAATGA AGACTGTTGC 



amp 



3451 ATCGGAGGAC CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA 
TAGCCTCCTG GCTTCCTCGA TTGGCGAAAA AACGTGTTGT ACCCCCTAGT 

amp 

3501 TGTAACTCGC CTTGATCGTT GGGAACCGGA GCTGAATGAA GCCATACCAA 
ACATTGAGCG GAACTAGCAA CCCTTGGCCT CGACTTACTT CGGTATGGTT 

amp 



3551 ACGACGAGCG TGACACCACG ATGCCTGTAG CAATGGCAAC AACGTTGCGC 
TGCTGCTCGC ACTGTGGTGC TACGGACATC GTTACCGTTG TTGCAACGCG 

amp 



3601 AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT 
TTTGATAATT GACCGCTTGA TGAATGAGAT CGAAGGGCCG TTGTTAATTA 



amp 



3651 AGACTGGATG GAGGCGGATA AAGTTGCAGG ACCACTTCTG CGCTCGGCCC 
TCTGACCTAC CTCCGCCTAT TTCAACGTCC TGGTGAAGAC GCGAGCCGGG 

amp 

3701 TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG TGAGCGTGGG 
AAGGCCGACC GACCAAATAA CGACTATTTA GACCTCGGCC ACTCGCACCC 

amp 



3751 TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT 
AGAGCGCCAT AGTAACGTCG TGACCCCGGT CTACCATTCG GGAGGGCATA 

amp 



3801 CGTAGTTATC TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA 
GCATCAATAG ATGTGCTGCC CCTCAGTCCG TTGATACCTA CTTGCTTTAT 

amp



3851 GACAGATCGC TGAGATAGGT GCCTCACTGA TTAAGCATTG GTAACTGTCA 
CTGTCTAGCG ACTCTATCCA CGGAGTGACT AATTCGTAAC CATTGACAGT 



3901 GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTAAAAC TTCATTTTTA 
CTGGTTCAAA TGAGTATATA TGAAATCTAA CTAAATTTTG AAGTAAAAAT

3951 ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA 
TAAATTTTCC TAGATCCACT TCTAGGAAAA ACTATTAGAG TACTGGTTTT 

4001 TCCCTTAACG TGAGTTTTCG TTCCACTGAG CGTCAGACCC CGTAGAAAAG 
AGGGAATTGC ACTCAAAAGC AAGGTGACTC GCAGTCTGGG GCATCTTTTC 

4051 ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA TCTGCTGCTT 
TAGTTTCCTA GAAGAACTCT AGGAAAAAAA GACGCGCATT AGACGACGAA

4101 GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG 
CGTTTGTTTT TTTGGTGGCG ATGGTCGCCA CCAAACAAAC GGCCTAGTTC 

4151 AGCTACCAAC TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA 
TCGATGGTTG AGAAAAAGGC TTCCATTGAC CGAAGTCGTC TCGCGTCTAT 

4201 CCAAATACTG TCCTTCTAGT GTAGCCGTAG TTAGGCCACC ACTTCAAGAA 
GGTTTATGAC AGGAAGATCA CATCGGCATC AATCCGGTGG TGAAGTTCTT 

4251 CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG TTACCAGTGG 
GAGACATCGT GGCGGATGTA TGGAGCGAGA CGATTAGGAC AATGGTCACC 

4301 CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA 
GACGACGGTC ACCGCTATTC AGCACAGAAT GGCCCAACCT GAGTTCTGCT 

4351 TAGTTACCGG ATAAGGCGCA GCGGTCGGGC TGAACGGGGG GTTCGTGCAC
ATCAATGGCC TATTCCGCGT CGCCAGCCCG ACTTGCCCCC CAAGCACGTG 

4401 ACAGCCCAGC TTGGAGCGAA CGACCTACAC CGAACTGAGA TACCTACAGC 
TGTCGGGTCG AACCTCGCTT GCTGGATGTG GCTTGACTCT ATGGATGTCG 

4451 GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG
CACTCGTAAC TCTTTCGCGG TGCGAAGGGC TTCCCTCTTT CCGCCTGTCC 

4501 TATCCGGTAA GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC 
ATAGGCCATT CGCCGTCCCA GCCTTGTCCT CTCGCGTGCT CCCTCGAAGG 

4551 AGGGGGAAAC GCCTGGTATC TTTATAGTCC TGTCGGGTTT CGCCACCTCT
TCCCCCTTTG CGGACCATAG AAATATCAGG ACAGCCCAAA GCGGTGGAGA

4601 GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG GAGCCTATGG
CTGAACTCGC AGCTAAAAAC ACTACGAGCA GTCCCCCCGC CTCGGATACC

4651 AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC
TTTTTGCGGT CGTTGCGCCG GAAAAATGCC AAGGACCGGA AAACGACCGG 

4701 TTTTGCTCAC ATGTTCTTTC CTGCGTTATC CCCTGATTCT GTGGATAACC 
AAAACGAGTG TACAAGAAAG GACGCAATAG GGGACTAAGA CACCTATTGG 

4751 GTATTACCGC CTTTGAGTGA GCTGATACCG CTCGCCGCAG CCGAACGACC 
CATAATGGCG GAAACTCACT CGACTATGGC GAGCGGCGTC GGCTTGCTGG 

4801 GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG GAAGAGCGCC CAATACGCAA 
CTCGCGTCGC TCAGTCACTC GCTCCTTCGC CTTCTCGCGG GTTATGCGTT

4851 ACCGCCTCTC CCCGCGCGTT GGCCGATTCA TTAATGCAGC TGGCACGACA
TGGCGGAGAG GGGCGCGCAA CCGGCTAAGT AATTACGTCG ACCGTGCTGT

4901 GGTTTCCCGA CTGGAAAGCG GGCAGTGAGC GCAACGCAAT TAATGTGAGT
CCAAAGGGCT GACCTTTCGC CCGTCACTCG CGTTGCGTTA ATTACACTCA

4951 TAGCTCACTC ATTAGGCACC CCAGGCTTTA CACTTTATGC TTCCGGCTCG
ATCGAGTGAG TAATCCGTGG GGTCCGAAAT GTGAAATACG AAGGCCGAGC 

5001 TATGTTGTGT GGAATTGTGA GCGGATAACA ATTTCACACA GGAAACAGCT 
ATACAACACA CCTTAACACT CGCCTATTGT TAAAGTGTGT CCTTTGTCGA 

5051 ATGACCATGA TTACGCCAAG CTGTAAGTTT AAACATGATC TTACTAACTA 
TACTGGTACT AATGCGGTTC GACATTCAAA TTTGTACTAG AATGATTGAT 

5101 ACTATTCTCA TTTAAATTTT CAGAGCTTAA AAATGGCTGA AATCACTCAC 
TGATAAGAGT AAATTTAAAA GTCTCGAATT TTTACCGACT TTAGTGAGTG

5151 AACGATGGAT ACGCTAACAA CTTGGAAATG AAATAAGCTT GCATGCCTGC 
TTGCTACCTA TGCGATTGTT GAACCTTTAC TTTATTCGAA CGTACGGACG 

vit-2 promoter 



StuI 



5201 AGGCCTTGGT CGACTCTAGA GGATCAAACT GTATTACTTG AAACAATTTA 
TCCGGAACCA GCTGAGATCT CCTAGTTTGA CATAATGAAC TTTGTTAAAT 

vit-2 promoter 



5251 GTTATATGTT TAGAACCCCT CATTCAAAAT TAATAGACAG GGCTCTCACC 
CAATATACAA ATCTTGGGGA GTAAGTTTTA ATTATCTGTC CCGAGAGTGG 

vit-2 promoter 



5301 GAATGTTGCA ATTTGTTTCT GATAAGGGTC ACAAAGCGGA GCGAATGCTT 
CTTACAACGT TAAACAAAGA CTATTCCCAG TGTTTCGCCT CGCTTACGAA 

vit-2 promoter 



5351 GAATGTGTCC ATCAATGAGC TTATCAATGC GCTAAAACGC TATAACTTCC 
CTTACACAGG TAGTTACTCG AATAGTTACG CGATTTTGCG ATATTGAAGG 

vit-2 promoter 



5401 ATATGAAGTC AATCGAACAT ATGTCAATCT TTAGCCGTAT ATAAAGGTGC
TATACTTCAG TTAGCTTGTA TACAGTTAGA AATCGGCATA TATTTCCACG

vit-2 promoter exon 1 (in frame - partial)



5451 ACTGAAAACA GTCCAATCAC GGTTCAGCCA TGAGGTCGAT CCCCGGCCGG
TGACTTTTGT CAGGTTAGTG CCAAGTCGGT ACTCCAGCTA GGGGCCGGCC

exon 1 (in frame - partial) synth. intron



5501 GATTGGCCAA AGGACCCAAA GGTATGTTTC GAATGATACT AACATAACAT 
CTAACCGGTT TCCTGGGTTT CCATACAAAG CTTACTATGA TTGTATTGTA 

synth. intron 

5551 AGAACATTTT CAGGAGGACC CTTGGAGGGT ACCGGGGATT GGCCAAAGGA 
TCTTGTAAAA GTCCTCCTGG GAACCTCCCA TGGCCCCTAA CCGGTTTCCT 

5601 CCCAAAGGTA TGTTTCGAAT GATACTAACA TAACATAGAA CATTTTCAGG 
GGGTTTCCAT ACAAAGCTTA CTATGATTGT ATTGTATCTT GTAAAAGTCC 

SacI

5651 AGGACCCTTG CTTGGAGGGT ACCGAGCTCA GAAAAA
TCCTGGGAAC GAACCTCCCA TGGCTCGAGT CTTTTT
IH. Predicted DNA sequence pGQ2

NLS luc+ 

1 ATGACTGCTC CAAAGAAGAA GCGTAAGGTA CCGGTAGAAA AAATGGAAGA 
TACTGACGAG GTTTCTTCTT CGCATTCCAT GGCCATCTTT TTTACCTTCT

luc+ 

51 CGCCAAAAAC ATAAAGAAAG GCCCGGCGCC ATTCTATCCG CTGGAAGATG 
GCGGTTTTTG TATTTCTTTC CGGGCCGCGG TAAGATAGGC GACCTTCTAC 

luc+ 



101 GAACCGCTGG AGAGCAACTG CATAAGGCTA TGAAGAGATA CGCCCTGGTT 
CTTGGCGACC TCTCGTTGAC GTATTCCGAT ACTTCTCTAT GCGGGACCAA 

luc+ 



151 CCTGGAACAA TTGCTTTTAC AGATGCACAT ATCGAGGTGG ACATCACTTA 
GGACCTTGTT AACGAAAATG TCTACGTGTA TAGCTCCACC TGTAGTGAAT

luc+

201 CGCTGAGTAC TTCGAAATGT CCGTTCGGTT GGCAGAAGCT ATGAAACGAT
GCGACTCATG AAGCTTTACA GGCAAGCCAA CCGTCTTCGA TACTTTGCTA 

luc+ 

251 ATGGGCTGAA TACAAATCAC AGAATCGTCG TATGCAGTGA AAACTCTCTT 
TACCCGACTT ATGTTTAGTG TCTTAGCAGC ATACGTCACT TTTGAGAGAA 

luc+ 

301 CAATTCTTTA TGCCGGTGTT GGGCGCGTTA TTTATCGGAG TTGCAGTTGC 
GTTAAGAAAT ACGGCCACAA CCCGCGCAAT AAATAGCCTC AACGTCAACG 

luc+ 



351 GCCCGCGAAC GACATTTATA ATGAACGTGA ATTGCTCAAC AGTATGGGCA 
CGGGCGCTTG CTGTAAATAT TACTTGCACT TAACGAGTTG TCATACCCGT 

luc+ 

401 TTTCGCAGCC TACCGTGGTG TTCGTTTCCA AAAAGGGGTT GCAAAAAATT 
AAAGCGTCGG ATGGCACCAC AAGCAAAGGT TTTTCCCCAA CGTTTTTTAA 

luc+



451 TTGAACGTGC AAAAAAAGCT CCCAATCATC CAAAAAATTA TTATCATGGA 
AACTTGCACG TTTTTTTCGA GGGTTAGTAG GTTTTTTAAT AATAGTACCT

luc+


501 TTCTAAAACG GATTACCAGG GATTTCAGTC GATGTACACG TTCGTCACAT 
AAGATTTTGC CTAATGGTCC CTAAAGTCAG CTACATGTGC AAGCAGTGTA 

luc+ 



551 CTCATCTACC TCCCGGTTTT AATGAATACG ATTTTGTGCC AGAGTCCTTC 
GAGTAGATGG AGGGCCAAAA TTACTTATGC TAAAACACGG TCTCAGGAAG 

luc+ 



601 GATAGGGACA AGACAATTGC ACTGATCATG AACTCCTCTG GATCTACTGG 
CTATCCCTGT TCTGTTAACG TGACTAGTAC TTGAGGAGAC CTAGATGACC 

luc+



651 TCTGCCTAAA GGTGTCGCTC TGCCTCATAG AACTGCCTGC GTGAGATTCT 
AGACGGATTT CCACAGCGAG ACGGAGTATC TTGACGGACG CACTCTAAGA 

luc+ 



701 CGCATGCCAG AGATCCTATT TTTGGCAATC AAATCATTCC GGATACTGCG 
GCGTACGGTC TCTAGGATAA AAACCGTTAG TTTAGTAAGG CCTATGACGC 

luc+



751 ATTTTAAGTG TTGTTCCATT CCATCACGGT TTTGGAATGT TTACTACACT 
TAAAATTCAC AACAAGGTAA GGTAGTGCCA AAACCTTACA AATGATGTGA 

luc+ 



801 CGGATATTTG ATATGTGGAT TTCGAGTCGT CTTAATGTAT AGATTTGAAG 
GCCTATAAAC TATACACCTA AAGCTCAGCA GAATTACATA TCTAAACTTC 

luc+ 



851 AAGAGCTGTT TCTGAGGAGC CTTCAGGATT ACAAGATTCA AAGTGCGCTG 
TTCTCGACAA AGACTCCTCG GAAGTCCTAA TGTTCTAAGT TTCACGCGAC 



luc+ 



901 CTGGTGCCAA CCCTATTCTC CTTCTTCGCC AAAAGCACTC TGATTGACAA
GACCACGGTT GGGATAAGAG GAAGAAGCGG TTTTCGTGAG ACTAACTGTT

luc+

951 ATACGATTTA TCTAATTTAC ACGAAATTGG TTCTGGTGGC GCTCCCCTCT
TATGCTAAAT AGATTAAATG TGCTTTAACG AAGACCACCG CGAGGGGAGA

luc+

1001 CTAAGGAAGT CGGGGAAGCG GTTGCCAAGA GGTTCCATCT GCCAGGTATC 
GATTCCTTCA GCCCCTTCGC CAACGGTTCT CCAAGGTAGA CGGTCCATAG 
luc+



1051 AGGCAAGGAT ATGGGCTCAC TGAGACTACA TCAGCTATTC TGATTACACC 
TCCGTTCCTA TACCCGAGTG ACTCTGATGT AGTCGATAAG ACTAATGTGG 

luc+ 



1101 CGAGGGGGAT GATAAACCGG GCGCGGTCGG TAAAGTTGTT CCATTTTTTG 
GCTCCCCCTA CTATTTGGCC CGCGCCAGCC ATTTCAACAA GGTAAAAAAC 



luc+ 



1151 AAGCGAAGGT TGTGGATCTG GATACCGGGA AAACGCTGGG CGTTAATCAA 
TTCGCTTCCA ACACCTAGAC CTATGGCCCT TTTGCGACCC GCAATTAGTT 



luc+ 



1201 AGAGGCGAAC TGTGTGTGAG AGGTCCTATG ATTATGTCCG GTTATGTAAA 
TCTCCGCTTG ACACACACTC TCCAGGATAC TAATACAGGC CAATACATTT 

luc+



1251 CAATCCGGAA GCGACCAACG CCTTGATTGA CAAGGATGGA TGGCTACATT
GTTAGGCCTT CGCTGGTTGC GGAACTAACT GTTCCTACCT ACCGATGTAA 

luc+ 



1301 CTGGAGACAT AGCTTACTGG GACGAAGACG AACACTTCTT CATCGTTGAC 
GACCTCTGTA TCGAATGACC CTGCTTCTGC TTGTGAAGAA GTAGCAACTG 

luc+ 

1351 CGCCTGAAGT CTCTGATTAA GTACAAAGGC TATCAGGTGG CTCCCGCTGA 
GCGGACTTCA GAGACTAATT CATGTTTCCG ATAGTCCACC GAGGGCGACT 

luc+ 

1401 ATTGGAATCC ATCTTGCTCC AACACCCCAA CATCTTCGAC GCAGGTGTCG
TAACCTTAGG TAGAACGAGG TTGTGGGGTT GTAGAAGCTG CGTCCACAGC 



luc+ 



1451 CAGGTCTTCC CGACGATGAC GCCGGTGAAC TTCCCGCCGC CGTTGTTGTT
GTCCAGAAGG GCTGCTACTG CGGCCACTTG AAGGGCGGCG GCAACAACAA

luc+

1501 TTGGAGCACG GAAAGACGAT GACGGAAAAA GAGATCGTGG ATTACGTCGC
AACCTCGTGC CTTTCTGCTA CTGCCTTTTT CTCTAGCACC TAATGCAGCG



1551 CAGTCAAGTA ACAACCGCGA AAAAGTTGCG CGGAGGAGTT GTGTTTGTGG 
GTCAGTTCAT TGTTGGCGCT TTTTCAACGC GCCTCCTCAA CACAAACACC
luc+ 



1601 ACGAAGTACC GAAAGGTCTT ACCGGAAAAC TCGACGCAAG AAAAATCAGA 
TGCTTCATGG CTTTCCAGAA TGGCCTTTTG AGCTGCGTTC TTTTTAGTCT 

luc+ 



1651 GAGATCCTCA TAAAGGCCAA GAAGGGCGGA AAGATCGCCG TGTAATTCTA 
CTCTAGGAGT ATTTCCGGTT CTTCCCGCCT TTCTAGCGGC ACATTAAGAT 

unc-54 3' UTR 



1701 GGAATTCCAA CTGAGCGCCG GTCGCTACCA TTACCAACTT GTCTGGTGTC
CCTTAAGGTT GACTCGCGGC CAGCGATGGT AATGGTTGAA CAGACCACAG

unc-54 3' UTR 



1751 AAAAATAATA GGGGCCGCTG TCATCAGAGT AAGTTTAAAC TGAGTTCTAC 
TTTTTATTAT CCCCGGCGAC AGTAGTCTCA TTCAAATTTG ACTCAAGATG 

unc-54 3' UTR



1801 TAACTAACGA GTAATATTTA AATTTTCAGC ATCTCGCGCC CGTGCCTCTG 
ATTGATTGCT CATTATAAAT TTAAAAGTCG TAGAGCGCGG GCACGGAGAC

unc-54 3' UTR 



1851 ACTTCTAAGT CCAATTACTC TTCAACATCC CTACATGCTC TTTCTCCCTG 
TGAAGATTCA GGTTAATGAG AAGTTGTAGG GATGTACGAG AAAGAGGGAC 

unc-54 3' UTR 



1901 TGCTCCCACC CCCTATTTTT GTTATTATCA AAAAAACTTC TTCTTAATTT 
ACGAGGGTGG GGGATAAAAA CAATAATAGT TTTTTTGAAG AAGAATTAAA 

unc-54 3' UTR 



1951 CTTTGTTTTT TAGCTTCTTT TAAGTCACCT CTAACAATGA AATTGTGTAG
GAAACAAAAA ATCGAAGAAA ATTCAGTGGA GATTGTTACT TTAACACATC 

unc-54 3' UTR 



2001 ATTCAAAAAT AGAATTAATT CGTAATAAAA AGTCGAAAAA AATTGTGCTC 
TAAGTTTTTA TCTTAATTAA GCATTATTTT TCAGCTTTTT TTAACACGAG 

unc-54 3' UTR

2051 CCTCCCCCCA TTAATAATAA TTCTATCCCA AAATCTACAC AATGTTCTGT 
GGAGGGGGGT AATTATTATT AAGATAGGGT TTTAGATGTG TTACAAGACA 



unc-54 3' UTR 
2101 GTACACTTCT TATGTTTTTT TTACTTCTGA TAAATTTTTT TTGAAACATC
CATGTGAAGA ATACAAAAAA AATGAAGACT ATTTAAAAAA AACTTTGTAG 

unc-54 3' UTR 

2151 ATAGAAAAAA CCGCACACAA AATACCTTAT CATATGTTAC GTTTCAGTTT 
TATCTTTTTT GGCGTGTGTT TTATGGAATA GTATACAATG CAAAGTCAAA 

unc-54 3' UTR 

2201 ATGACCGCAA TTTTTATTTC TTCGCACGTC TGGGCCTCTC ATGACGTCAA 
TACTGGCGTT AAAAATAAAG AAGCGTGCAG ACCCGGAGAG TACTGCAGTT 

unc-54 3' UTR 

2251 ATCATGCTCA TCGTGAAAAA GTTTTGGAGT ATTTTTGGAA TTTTTCAATC 
TAGTACGAGT AGCACTTTTT CAAAACCTCA TAAAAACCTT AAAAAGTTAG 

unc-54 3' UTR



2301 AAGTGAAAGT TTATGAAATT AATTTTCCTG CTTTTGCTTT TTGGGGGTTT
TTCACTTTCA AATACTTTAA TTAAAAGGAC GAAAACGAAA AACCCCCAAA 

unc-54 3' UTR 

2351 CCCCTATTGT TTGTCAAGAG TTTCGAGGAC GGCGTTTTTC TTGCTAAAAT 
GGGGATAACA AACAGTTCTC AAAGCTCCTG CCGCAAAAAG AACGATTTTA 

unc-54 3' UTR 



2401 CACAAGTATT GATGAGCACG ATGCAAGAAA GATCGGAAGA AGGTTTGGGT 
GTGTTCATAA CTACTCGTGC TACGTTCTTT CTAGCCTTCT TCCAAACCCA 

unc-54 3' UTR 

2451 TTGAGGCTCA GTGGAAGGTG AGTAGAAGTT GATAATTTGA AAGTGGAGTA 
AACTCCGAGT CACCTTCCAC TCATCTTCAA CTATTAAACT TTCACCTCAT 

unc-54 3' UTR 

2501 GTGTCTATGG GGTTTTTGCC TTAAATGACA GAATACATTC CCAATATACC 
CACAGATACC CCAAAAACGG AATTTACTGT CTTATGTAAG GGTTATATGG 

unc-54 3' UTR MSC II 



2551 AAACATAACT GTTTCCTACT AGTCGGCCGT ACGGGCCCTT TCGTCTCGCG 
TTTGTATTGA CAAAGGATGA TCAGCCGGCA TGCCCGGGAA AGCAGAGCGC

2601 CGTTTCGGTG ATGACGGTGA AAACCTCTGA CACATGCAGC TCCCGGAGAC 
GCAAAGCCAC TACTGCCACT TTTGGAGACT GTGTACGTCG AGGGCCTCTG 

2651 GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA GCCCGTCAGG 
CCAGTGTCGA ACAGACATTC GCCTACGGCC CTCGTCTGTT CGGGCAGTCC 
2701 GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA
CGCGCAGTCG CCCACAACCG CCCACAGCCC CGACCGAATT GATACGCCGT 

2751 TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA 
AGTCTCGTCT AACATGACTC TCACGTGGTA TACGCCACAC TTTATGGCGT 

2801 CAGATGCGTA AGGAGAAAAT ACCGCATCAG GCGGCCTTAA GGGCCTCGTG 
GTCTACGCAT TCCTCTTTTA TGGCGTAGTC CGCCGGAATT CCCGGAGCAC 

2851 ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG TTTCTTAGAC 
TATGCGGATA AAAATATCCA ATTACAGTAC TATTATTACC AAAGAATCTG 

2901 GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT 
CAGTCCACCG TGAAAAGCCC CTTTACACGC GCCTTGGGGA TAAACAAATA 

2951 TTTTCTAAAT ACATTCAAAT ATGTATCCGC TCATGAGACA ATAACCCTGA
AAAAGATTTA TGTAAGTTTA TACATAGGCG AGTACTCTGT TATTGGGACT 

amp 

3001 TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA TTCAACATTT 
ATTTACGAAG TTATTATAAC TTTTTCCTTC TCATACTCAT AAGTTGTAAA

amp 



3051 CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG
GGCACAGCGG GAATAAGGGA AAAAACGCCG TAAAACGGAA GGACAAAAAC 

amp 



3101 CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT 
GAGTGGGTCT TTGCGACCAC TTTCATTTTC TACGACTTCT AGTCAACCCA 

amp 



3151 GCACGAGTGG GTTACATCGA ACTGGATCTC AACAGCGGTA AGATCCTTGA 
CGTGCTCACC CAATGTAGCT TGACCTAGAG TTGTCGCCAT TCTAGGAACT 

amp 



3201 GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT TTTAAAGTTC 
CTCAAAAGCG GGGCTTCTTG CAAAAGGTTA CTACTCGTGA AAATTTCAAG 

amp 

3251 TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC 
ACGATACACC GCGCCATAAT AGGGCATAAC TGCGGCCCGT TCTCGTTGAG

amp 



3301 GGTCGCCGCA TACACTATTC TCAGAATGAC TTGGTTGAGT ACTCACCAGT 
CCAGCGGCGT ATGTGATAAG AGTCTTACTG AACCAACTCA TGAGTGGTCA 

amp 
3351 CACAGAAAAG CATCTTACGG ATGGCATGAC AGTAAGAGAA TTATGCAGTG
GTGTCTTTTC GTAGAATGCC TACCGTACTG TCATTCTCTT AATACGTCAC 

amp 



3401 CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG 
GACGGTATTG GTACTCACTA TTGTGACGCC GGTTGAATGA AGACTGTTGC 



amp 



3451 ATCGGAGGAC CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA 
TAGCCTCCTG GCTTCCTCGA TTGGCGAAAA AACGTGTTGT ACCCCCTAGT 



amp 



3501 TGTAACTCGC CTTGATCGTT GGGAACCGGA GCTGAATGAA GCCATACCAA 
ACATTGAGCG GAACTAGCAA CCCTTGGCCT CGACTTACTT CGGTATGGTT 



3551 ACGACGAGCG TGACACCACG ATGCCTGTAG CAATGGCAAC AACGTTGCGC
TGCTGCTCGC ACTGTGGTGC TACGGACATC GTTACCGTTG TTGCAACGCG 



amp 

3601 AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT
TTTGATAATT GACCGCTTGA TGAATGAGAT CGAAGGGCCG TTGTTAATTA 



amp 



3651 AGACTGGATG GAGGCGGATA AAGTTGCAGG ACCACTTCTG CGCTCGGCCC 
TCTGACCTAC CTCCGCCTAT TTCAACGTCC TGGTGAAGAC GCGAGCCGGG 



amp 



3701 TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG TGAGCGTGGG 
AAGGCCGACC GACCAAATAA CGACTATTTA GACCTCGGCC ACTCGCACCC 



amp 



3751 TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT 
AGAGCGCCAT AGTAACGTCG TGACCCCGGT CTACCATTCG GGAGGGCATA 



amp 



3801 CGTAGTTATC TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA 
GCATCAATAG ATGTGCTGCC CCTCAGTCCG TTGATACCTA CTTGCTTTAT 

amp 



3851 GACAGATCGC TGAGATAGGT GCCTCACTGA TTAAGCATTG GTAACTGTCA 
CTGTCTAGCG ACTCTATCCA CGGAGTGACT AATTCGTAAC CATTGACAGT 
3901 GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTAAAAC TTCATTTTTA
CTGGTTCAAA TGAGTATATA TGAAATCTAA CTAAATTTTG AAGTAAAAAT 

3951 ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA 
TAAATTTTCC TAGATCCACT TCTAGGAAAA ACTATTAGAG TACTGGTTTT 

4001 TCCCTTAACG TGAGTTTTCG TTCCACTGAG CGTCAGACCC CGTAGAAAAG
AGGGAATTGC ACTCAAAAGC AAGGTGACTC GCAGTCTGGG GCATCTTTTC 

4051 ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA TCTGCTGCTT 
TAGTTTCCTA GAAGAACTCT AGGAAAAAAA GACGCGCATT AGACGACGAA 

4101 GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG 
CGTTTGTTTT TTTGGTGGCG ATGGTCGCCA CCAAACAAAC GGCCTAGTTC 

4151 AGCTACCAAC TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA 
TCGATGGTTG AGAAAAAGGC TTCCATTGAC CGAAGTCGTC TCGCGTCTAT 

4201 CCAAATACTG TCCTTCTAGT GTAGCCGTAG TTAGGCCACC ACTTCAAGAA
GGTTTATGAC AGGAAGATCA CATCGGCATC AATCCGGTGG TGAAGTTCTT

4251 CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG TTACCAGTGG
GAGACATCGT GGCGGATGTA TGGAGCGAGA CGATTAGGAC AATGGTCACC

4301 CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA
GACGACGGTC ACCGCTATTC AGCACAGAAT GGCCCAACCT GAGTTCTGCT

4351 TAGTTACCGG ATAAGGCGCA GCGGTCGGGC TGAACGGGGG GTTCGTGCAC 
ATCAATGGCC TATTCCGCGT CGCCAGCCCG ACTTGCCCCC CAAGCACGTG

4401 ACAGCCCAGC TTGGAGCGAA CGACCTACAC CGAACTGAGA TACCTACAGC 
TGTCGGGTCG AACCTCGCTT GCTGGATGTG GCTTGACTCT ATGGATGTCG 

4451 GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG 
CACTCGTAAC TCTTTCGCGG TGCGAAGGGC TTCCCTCTTT CCGCCTGTCC 

4501 TATCCGGTAA GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC 
ATAGGCCATT CGCCGTCCCA GCCTTGTCCT CTCGCGTGCT CCCTCGAAGG 

4551 AGGGGGAAAC GCCTGGTATC TTTATAGTCC TGTCGGGTTT CGCCACCTCT 
TCCCCCTTTG CGGACCATAG AAATATCAGG ACAGCCCAAA GCGGTGGAGA 

4601 GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG GAGCCTATGG
CTGAACTCGC AGCTAAAAAC ACTACGAGCA GTCCCCCCGC CTCGGATACC

4651 AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC
TTTTTGCGGT CGTTGCGCCG GAAAAATGCC AAGGACCGGA AAACGACCGG 

4701 TTTTGCTCAC ATGTTCTTTC CTGCGTTATC CCCTGATTCT GTGGATAACC 
AAAACGAGTG TACAAGAAAG GACGCAATAG GGGACTAAGA CACCTATTGG 

4751 GTATTACCGC CTTTGAGTGA GCTGATACCG CTCGCCGCAG CCGAACGACC 
CATAATGGCG GAAACTCACT CGACTATGGC GAGCGGCGTC GGCTTGCTGG 
4801 GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG GAAGAGCGCC CAATACGCAA
CTCGCGTCGC TCAGTCACTC GCTCCTTCGC CTTCTCGCGG GTTATGCGTT 

4851 ACCGCCTCTC CCCGCGCGTT GGCCGATTCA TTAATGCAGC TGGCACGACA 
TGGCGGAGAG GGGCGCGCAA CCGGCTAAGT AATTACGTCG ACCGTGCTGT 

4901 GGTTTCCCGA CTGGAAAGCG GGCAGTGAGC GCAACGCAAT TAATGTGAGT
CCAAAGGGCT GACCTTTCGC CCGTCACTCG CGTTGCGTTA ATTACACTCA 

4951 TAGCTCACTC ATTAGGCACC CCAGGCTTTA CACTTTATGC TTCCGGCTCG 
ATCGAGTGAG TAATCCGTGG GGTCCGAAAT GTGAAATACG AAGGCCGAGC 

5001 TATGTTGTGT GGAATTGTGA GCGGATAACA ATTTCACACA GGAAACAGCT 
ATACAACACA CCTTAACACT CGCCTATTGT TAAAGTGTGT CCTTTGTCGA 

5051 ATGACCATGA TTACGCCAAG CTGTAAGTTT AAACATGATC TTACTAACTA 
TACTGGTACT AATGCGGTTC GACATTCAAA TTTGTACTAG AATGATTGAT 

5101 ACTATTCTCA TTTAAATTTT CAGAGCTTAA AAATGGCTGA AATCACTCAC 
TGATAAGAGT AAATTTAAAA GTCTCGAATT TTTACCGACT TTAGTGAGTG 

5151 AACGATGGAT ACGCTAACAA CTTGGAAATG AAATAAGCTT GCATGCCTGC 
TTGCTACCTA TGCGATTGTT GAACCTTTAC TTTATTCGAA CGTACGGACG 

ctl-1 promoter + coding region 



o-GQ3



StuI 



5201 AGGCCTGAGA TATTTTGCGC GTCAAATATG TTTTGTGTCC CCGTAATATT 
TCCGGACTCT ATAAAACGCG CAGTTTATAC AAAACACAGG GGCATTATAA 

ctl-1 promoter + coding region 



5251 TTTTTAAATC AAATTTCACA TTTTAACCAT AAAAAACTCT TTCAAAAGTG 
AAAAATTTAG TTTAAAGTGT AAAATTGGTA TTTTTTGAGA AAGTTTTCAC 

ctl-1 promoter + coding region 

5301 TAATTTTCTA CGCAAAAATG CCGTTCGGAT GAAAAATTAC TTTTGAAAAA 
ATTAAAAGAT GCGTTTTTAC GGCAAGCCTA CTTTTTAATG AAAACTTTTT 

ctl-1 promoter + coding region 



5351 CAAACTCGAA ACTACGGTAC GCAAAAAAGT ACATCGGTGT TTGCACATAA 
GTTTGAGCTT TGATGCCATG CGTTTTTTCA TGTAGCCACA AACGTGTATT 

ctl-1 promoter + coding region 



5401 GTGAAAACAA TGTTGTTTTT TTGTAATTAA AATCGATTAA TTTTTTTTCC
CACTTTTGTT ACAACAAAAA AACATTAATT TTAGCTAATT AAAAAAAAGG

ctl-1 promoter + coding region 



5451 CGGAAAACAA AAACGTTTTC AGCGTGGATT TCTATTGTTT CTTGCGTAAA
GCCTTTTGTT TTTGCAAAAG TCGCACCTAA AGATAACAAA GAACGCATTT 

ctl-1 promoter + coding region 



5501 AAAAAATTAT TTACCAATTT TAAACGATAA TTTCCACGAA TTTTCGCCAT 
TTTTTTAATA AATGGTTAAA ATTTGCTATT AAAGGTGCTT AAAAGCGGTA 

ctl-1 promoter + coding region 



5551 TAATCTCTCG ATTTTGTTGA TTCTTGACTC CGAGCAATCT CTCCGGTTTT 
ATTAGAGAGC TAAAACAACT AAGAACTGAG GCTCGTTAGA GAGGCCAAAA

ctl-1 promoter + coding region 



5601 CGCAAACGAT TATATTATTT ATTTGTTTTC CTTTTCAGTG CCGATTCTCG 
GCGTTTGCTA ATATAATAAA TAAACAAAAG GAAAAGTCAC GGCTAAGAGC 



ctl-1 promoter + coding region 



Exon 1 



5651 GAAATTCAAC AGTAAATCTT CAAAATGCCA ATGCTTCCCC ACATGGTCAA 
CTTTAAGTTG TCATTTAGAA GTTTTACGGT TACGAAGGGG TGTACCAGTT 

ctl-1 promoter + coding region 

Exon 1 



5701 TCTAAGTGAG TTTCTTTGTT ACAAAATACA CGTGATGTCA GATTGTCTCA 
AGATTCACTC AAAGAAACAA TGTTTTATGT GCACTACAGT CTAACAGAGT 

ctl-1 promoter + coding region 



5751 TTTCGGTTTG ATCTACGTAG ATCTACAAAA AATGCGGGAA TTGAGCCGCA 
AAAGCCAAAC TAGATGCATC TAGATGTTTT TTACGCCCTT AACTCGGCGT 

ctl-1 promoter + coding region

5801 GAGTTCTCAA CTGCTTTCGG ATGGTTAAGA ACGTGCGGAC GTCAAATTGT 
CTCAAGAGTT GACGAAAGCG TACCAATTCT TGCACGCCTG CAGTTTAACA 

ctl-1 promoter + coding region 



5851 TTTGGGCAAA AATTCCCGCA TTTTTTGTAG ATCAAACCGT AATGGGACAG 
AAACCCGTTT TTAAGGGCGT AAAAAACATC TAGTTTGGCA TTACCCTGTC 

ctl-1 promoter + coding region 

Exon 2 

5901 TCTGGCACCA CGTGACTATA TATTTTTAGC GGTCAACGAC ACAAAACCCG 
AGACCGTGGT GCACTGATAT ATAAAAATCG CCAGTTGCTG TGTTTTGGGC 
ctl-1 promoter + coding region 



Exon 2 



5951 GACCAATGGC TGAGGATCAG CTGAAAGCTT ATAGAGATAG AAATCAGGTG 
CTGGTTACCG ACTCCTAGTC GACTTTCGAA TATCTCTATC TTTAGTCCAC 

ctl-1 promoter + coding region

6001 AGAAAAATCA ATTTCAGCGA TTTTCTTCGC AATTTATATA AAAACTGATT
TCTTTTTAGT TAAAGTCGCT AAAAGAAGCG TTAAATATAT TTTTGACTAA

ctl-1 promoter + coding region 

o-GQ4 



Exon 3 



SacI 



6051 TTTCCAGGAA CCCCACCTGC TCACCACATC CAATCGGAGC TCAGAAAAA 
AAAGGTCCTT GGGGTGGACG AGTGGTGTAG GTTAGCCTCG AGTCTTTTT 
Predicted DNA sequence



OGQ6 GFPI 
AscI 

1 CGCGCCATGA GTAAAGGAGA AGAACTTTTC ACTGGAGTTG TCCCAATTCT 
GCGCGGTACT CATTTCCTCT TCTTGAAAAG TGACCTCAAC AGGGTTAAGA 

GFPI 

51 TGTTGAATTA GATGGTGATG TTAATGGGCA CAAATTTTCT GTCAGTGGAG 
ACAACTTAAT CTACCACTAC AATTACCCGT GTTTAAAAGA CAGTCACCTC 

GFPI 

101 AGGGTGAAGG TGATGCAACA TACGGAAAAC TTACCCTTAA ATTTATTTGC 
TCCCACTTCC ACTACGTTGT ATGCCTTTTG AATGGGAATT TAAATAAACG 

GFPI 



151 ACTACTGGAA AACTACCTGT TCCATGGGTA AGTTTAAACA TATATATACT 
TGATGACCTT TTGATGGACA AGGTACCCAT TCAAATTTGT ATATATATGA 

GFPII 

201 AACTAACCCT GATTATTTAA ATTTTCAGCC AACACTTGTC ACTACTTTCT 
TTGATTGGGA CTAATAAATT TAAAAGTCGG TTGTGAACAG TGATGAAAGA 



GFPII 



251 GTTATGGTGT TCAATGCTTC TCGAGATACC CAGATCATAT GAAACGGCAT
CAATACCACA AGTTACGAAG AGCTCTATGG GTCTAGTATA CTTTGCCGTA 

GFPII 



301 GACTTTTTCA AGAGTGCCAT GCCCGAAGGT TATGTACAGG AAAGAACTAT 
CTGAAAAAGT TCTCACGGTA CGGGCTTCCA ATACATGTCC TTTCTTGATA 

GFPII 



351 ATTTTTCAAA GATGACGGGA ACTACAAGAC ACGTAAGTTT AAACAGTTCG 
TAAAAAGTTT CTACTGCCCT TGATGTTCTG TGCATTCAAA TTTGTCAAGC 

GFPIII 



401 GTACTAACTA ACCATACATA TTTAAATTTT CAGGTGCTGA AGTCAAGTTT 
CATGATTGAT TGGTATGTAT AAATTTAAAA GTCCACGACT TCAGTTCAAA



GFPIII 
451 GAAGGTGATA CCCTTGTTAA TAGAATCGAG TTAAAAGGTA TTGATTTTAA
CTTCCACTAT GGGAACAATT ATCTTAGCTC AATTTTCCAT AACTAAAATT 

GFPIII 



501 AGAAGATGGA AACATTCTTG GACACAAATT GGAATACAAC TATAACTCAC 
TCTTCTACCT TTGTAAGAAC CTGTGTTTAA CCTTATGTTG ATATTGAGTG 



GFPIII 

551 ACAATGTATA CATCATGGCA GACAAACAAA AGAATGGAAT CAAAGTTGTA 
TGTTACATAT GTAGTACCGT CTGTTTGTTT TCTTACCTTA GTTTCAACAT 

GFPIV 

601 AGTTTAAACT TGGACTTACT AACTAACGGA TTATATTTAA ATTTTCAGAA 
TCAAATTTGA ACCTGAATGA TTGATTGCCT AATATAAATT TAAAAGTCTT 



GFPIV 

651 CTTCAAAATT AGACACAACA TTGAAGATGG AAGCGTTCAA CTAGCAGACC 
GAAGTTTTAA TCTGTGTTGT AACTTCTACC TTCGCAAGTT GATCGTCTGG

GFPIV 



701 ATTATCAACA AAATACTCCA ATTGGCGATG GCCCTGTCCT TTTACCAGAC
TAATAGTTGT TTTATGAGGT TAACCGCTAC CGGGACAGGA AAATGGTCTG 



GFPIV 



751 AACCATTACC TGTCCACACA ATCTGCCCTT TCGAAAGATC CCAACGAAAA 
TTGGTAATGG ACAGGTGTGT TAGACGGGAA AGCTTTCTAG GGTTGCTTTT 

GFPIV 



801 GAGAGACCAC ATGGTCCTTC TTGAGTTTGT AACAGCTGCT GGGATTACAC 
CTCTCTGGTG TACCAGGAAG AACTCAAACA TTGTCGACGA CCCTAATGTG 

GFPIV FseI



851 ATGGCATGGA TGAACTATAC AAATAGGGCC GGCCGAGCTC CGCATCGGCC 
TACCGTACCT ACTTGATATG TTTATCCCGG CCGGCTCGAG GCGTAGCCGG 

unc-54 3' UTR 



901 GCTGTCATCA GATCGCCATC TCGCGCCCGT GCCTCTGACT TCTAAGTCCA 
CGACAGTAGT CTAGCGGTAG AGCGCGGGCA CGGAGACTGA AGATTCAGGT 

unc-54 3' UTR



951 ATTACTCTTC AACATCCCTA CATGCTCTTT CTCCCTGTGC TCCCACCCCC 
TAATGAGAAG TTGTAGGGAT GTACGAGAAA GAGGGACACG AGGGTGGGGG 



unc-54 3' UTR

1001 TATTTTTGTT ATTATCAAAA AAACTTCTTC TTAATTTCTT TGTTTTTTAG
ATAAAAACAA TAATAGTTTT TTTGAAGAAG AATTAAAGAA ACAAAAAATC 



unc-54 3' UTR



1051 CTTCTTTTAA GTCACCTCTA ACAATGAAAT TGTGTAGATT CAAAAATAGA 
GAAGAAAATT CAGTGGAGAT TGTTACTTTA ACACATCTAA GTTTTTATCT 

unc-54 3' UTR 



1101 ATTAATTCGT AATAAAAAGT CGAAAAAAAT TGTGCTCCCT CCCCCCATTA 
TAATTAAGCA TTATTTTTCA GCTTTTTTTA ACACGAGGGA GGGGGGTAAT 

unc-54 3' UTR



1151 ATAATAATTC TATCCCAAAA TCTACACAAT GTTCTGTGTA CACTTCTTAT 
TATTATTAAG ATAGGGTTTT AGATGTGTTA CAAGACACAT GTGAAGAATA 

unc-54 3' UTR 



1201 GTTTTTTTTA CTTCTGATAA ATTTTTTTTG AAACATCATA GAAAAAACCG 
CAAAAAAAAT GAAGACTATT TAAAAAAAAC TTTGTAGTAT CTTTTTTGGC 

unc-54 3' UTR



1251 CACACAAAAT ACCTTATCAT ATGTTACGTT TCAGTTTATG ACCGCAATTT 
GTGTGTTTTA TGGAATAGTA TACAATGCAA AGTCAAATAC TGGCGTTAAA 

unc-54 3' UTR 


1301 TTATTTCTTC GCACGTCTGG GCCTCTCATG ACGTCAAATC ATGCTCATCG 
AATAAAGAAG CGTGCAGACC CGGAGAGTAC TGCAGTTTAG TACGAGTAGC 

unc-54 3' UTR 



1351 TGAAAAAGTT TTGGAGTATT TTTGGAATTT TTCAATCAAG TGAAAGTTTA 
ACTTTTTCAA AACCTCATAA AAACCTTAAA AAGTTAGTTC ACTTTCAAAT 

unc-54 3' UTR 



1401 TGAAATTAAT TTTCCTGCTT TTGCTTTTTG GGGGTTTCCC CTATTGTTTG
ACTTTAATTA AAAGGACGAA AACGAAAAAC CCCCAAAGGG GATAACAAAC 

unc-54 3' UTR 



1451 TCAAGAGTTT CGAGGACGGC GTTTTTCTTG CTAAAATCAC AAGTATTGAT
AGTTCTCAAA GCTCCTGCCG CAAAAAGAAC GATTTTAGTG TTCATAACTA

unc-54 3' UTR



1501 GAGCACGATG CAAGAAAGAT CGGAAGAAGG TTTGGGTTTG AGGCTCAGTG
CTCGTGCTAC GTTCTTTCTA GCCTTCTTCC AAACCCAAAC TCCGAGTCAC

unc-54 3' UTR

1551 GAAGGTGAGT AGAAGTTGAT AATTTGAAAG TGGAGTAGTG TCTATGGGGT
CTTCCACTCA TCTTCAACTA TTAAACTTTC ACCTCATCAC AGATACCCCA 

unc-54 3' UTR 



1601 TTTTGCCTTA AATGACAGAA TACATTCCCA ATATACCAAA CATAACTGTT 
AAAACGGAAT TTACTGTCTT ATGTAAGGGT TATATGGTTT GTATTGACAA 

unc-54 3' UTR

1651 TCCTACTAGT CGGCCGTACG GGCCCTTTCG TCTCGCGCGT TTCGGTGATG 
AGGATGATCA GCCGGCATGC CCGGGAAAGC AGAGCGCGCA AAGCCACTAC 

1701 ACGGTGAAAA CCTCTGACAC ATGCAGCTCC CGGAGACGGT CACAGCTTGT 
TGCCACTTTT GGAGACTGTG TACGTCGAGG GCCTCTGCCA GTGTCGAACA 

1751 CTGTAAGCGG ATGCCGGGAG CAGACAAGCC CGTCAGGGCG CGTCAGCGGG
GACATTCGCC TACGGCCCTC GTCTGTTCGG GCAGTCCCGC GCAGTCGCCC

1801 TGTTGGCGGG TGTCGGGGCT GGCTTAACTA TGCGGCATCA GAGCAGATTG 
ACAACCGCCC ACAGCCCCGA CCGAATTGAT ACGCCGTAGT CTCGTCTAAC 

1851 TACTGAGAGT GCACCATATG CGGTGTGAAA TACCGCACAG ATGCGTAAGG 
ATGACTCTCA CGTGGTATAC GCCACACTTT ATGGCGTGTC TACGCATTCC 

1901 AGAAAATACC GCATCAGGCG GCCTTAAGGG CCTCGTGATA CGCCTATTTT 
TCTTTTATGG CGTAGTCCGC CGGAATTCCC GGAGCACTAT GCGGATAAAA 

1951 TATAGGTTAA TGTCATGATA ATAATGGTTT CTTAGACGTC AGGTGGCACT 
ATATCCAATT ACAGTACTAT TATTACCAAA GAATCTGCAG TCCACCGTGA 

2001 TTTCGGGGAA ATGTGCGCGG AACCCCTATT TGTTTATTTT TCTAAATACA 
AAAGCCCCTT TACACGCGCC TTGGGGATAA ACAAATAAAA AGATTTATGT 

2051 TTCAAATATG TATCCGCTCA TGAGACAATA ACCCTGATAA ATGCTTCAAT
AAGTTTATAC ATAGGCGAGT ACTCTGTTAT TGGGACTATT TACGAAGTTA 

amp 



2101 AATATTGAAA AAGGAAGAGT ATGAGTATTC AACATTTCCG TGTCGCCCTT 
TTATAACTTT TTCCTTCTCA TACTCATAAG TTGTAAAGGC ACAGCGGGAA

amp 



2151 ATTCCCTTTT TTGCGGCATT TTGCCTTCCT GTTTTTGCTC ACCCAGAAAC 
TAAGGGAAAA AACGCCGTAA AACGGAAGGA CAAAAACGAG TGGGTCTTTG 

amp



2201 
amp



2251 ACATCGAACT GGATCTCAAC AGCGGTAAGA TCCTTGAGAG TTTTCGCCCC 
TGTAGCTTGA CCTAGAGTTG TCGCCATTCT AGGAACTCTC AAAAGCGGGG 

amp 

2301 GAAGAACGTT TTCCAATGAT GAGCACTTTT AAAGTTCTGC TATGTGGCGC 
CTTCTTGCAA AAGGTTACTA CTCGTGAAAA TTTCAAGACG ATACACCGCG

amp 



2351 GGTATTATCC CGTATTGACG CCGGGCAAGA GCAACTCGGT CGCCGCATAC 
CCATAATAGG GCATAACTGC GGCCCGTTCT CGTTGAGCCA GCGGCGTATG 

amp 

2401 ACTATTCTCA GAATGACTTG GTTGAGTACT CACCAGTCAC AGAAAAGCAT
TGATAAGAGT CTTACTGAAC CAACTCATGA GTGGTCAGTG TCTTTTCGTA

amp 



2451 CTTACGGATG GCATGACAGT AAGAGAATTA TGCAGTGCTG CCATAACCAT 
GAATGCCTAC CGTACTGTCA TTCTCTTAAT ACGTCACGAC GGTATTGGTA 

amp



2501 GAGTGATAAC ACTGCGGCCA ACTTACTTCT GACAACGATC GGAGGACCGA 
CTCACTATTG TGACGCCGGT TGAATGAAGA CTGTTGCTAG CCTCCTGGCT 

amp 

2551 AGGAGCTAAC CGCTTTTTTG CACAACATGG GGGATCATGT AACTCGCCTT 
TCCTCGATTG GCGAAAAAAC GTGTTGTACC CCCTAGTACA TTGAGCGGAA 

amp 



2601 GATCGTTGGG AACCGGAGCT GAATGAAGCC ATACCAAACG ACGAGCGTGA 
CTAGCAACCC TTGGCCTCGA CTTACTTCGG TATGGTTTGC TGCTCGCACT 

amp 

2651 CACCACGATG CCTGTAGCAA TGGCAACAAC GTTGCGCAAA CTATTAACTG 
GTGGTGCTAC GGACATCGTT ACCGTTGTTG CAACGCGTTT GATAATTGAC 

amp



2701 GCGAACTACT TACTCTAGCT TCCCGGCAAC AATTAATAGA CTGGATGGAG 
CGCTTGATGA ATGAGATCGA AGGGCCGTTG TTAATTATCT GACCTACCTC 



amp 



2751 GCGGATAAAG TTGCAGGACC ACTTCTGCGC TCGGCCCTTC CGGCTGGCTG 
CGCCTATTTC AACGTCCTGG TGAAGACGCG AGCCGGGAAG GCCGACCGAC 
amp



2801 GTTTATTGCT GATAAATCTG GAGCCGGTGA GCGTGGGTCT CGCGGTATCA
CAAATAACGA CTATTTAGAC CTCGGCCACT CGCACCCAGA GCGCCATAGT

amp

2851 TTGCAGCACT GGGGCCAGAT GGTAAGCCCT CCCGTATCGT AGTTATCTAC
AACGTCGTGA CCCCGGTCTA CCATTCGGGA GGGCATAGCA TCAATAGATG

amp

2901 ACGACGGGGA GTCAGGCAAC TATGGATGAA CGAAATAGAC AGATCGCTGA 
TGCTGCCCCT CAGTCCGTTG ATACCTACTT GCTTTATCTG TCTAGCGACT 



amp 



2951 GATAGGTGCC TCACTGATTA AGCATTGGTA ACTGTCAGAC CAAGTTTACT 
CTATCCACGG AGTGACTAAT TCGTAACCAT TGACAGTCTG GTTCAAATGA 

3001 CATATATACT TTAGATTGAT TTAAAACTTC ATTTTTAATT TAAAAGGATC
GTATATATGA AATCTAACTA AATTTTGAAG TAAAAATTAA ATTTTCCTAG 

3051 TAGGTGAAGA TCCTTTTTGA TAATCTCATG ACCAAAATCC CTTAACGTGA 
ATCCACTTCT AGGAAAAACT ATTAGAGTAC TGGTTTTAGG GAATTGCACT 

3101 GTTTTCGTTC CACTGAGCGT CAGACCCCGT AGAAAAGATC AAAGGATCTT 
CAAAAGCAAG GTGACTCGCA GTCTGGGGCA TCTTTTCTAG TTTCCTAGAA 

3151 CTTGAGATCC TTTTTTTCTG CGCGTAATCT GCTGCTTGCA AACAAAAAAA 
GAACTCTAGG AAAAAAAGAC GCGCATTAGA CGACGAACGT TTGTTTTTTT 

3201 CCACCGCTAC CAGCGGTGGT TTGTTTGCCG GATCAAGAGC TACCAACTCT 
GGTGGCGATG GTCGCCACCA AACAAACGGC CTAGTTCTCG ATGGTTGAGA 

3251 TTTTCCGAAG GTAACTGGCT TCAGCAGAGC GCAGATACCA AATACTGTCC 
AAAAGGCTTC CATTGACCGA AGTCGTCTCG CGTCTATGGT TTATGACAGG 

3301 TTCTAGTGTA GCCGTAGTTA GGCCACCACT TCAAGAACTC TGTAGCACCG 
AAGATCACAT CGGCATCAAT CCGGTGGTGA AGTTCTTGAG ACATCGTGGC 

3351 CCTACATACC TCGCTCTGCT AATCCTGTTA CCAGTGGCTG CTGCCAGTGG 
GGATGTATGG AGCGAGACGA TTAGGACAAT GGTCACCGAC GACGGTCACC 

3401 CGATAAGTCG TGTCTTACCG GGTTGGACTC AAGACGATAG TTACCGGATA 
GCTATTCAGC ACAGAATGGC CCAACCTGAG TTCTGCTATC AATGGCCTAT 

3451 AGGCGCAGCG GTCGGGCTGA ACGGGGGGTT CGTGCACACA GCCCAGCTTG 
TCCGCGTCGC CAGCCCGACT TGCCCCCCAA GCACGTGTGT CGGGTCGAAC 

3501 GAGCGAACGA CCTACACCGA ACTGAGATAC CTACAGCGTG AGCATTGAGA 
CTCGCTTGCT GGATGTGGCT TGACTCTATG GATGTCGCAC TCGTAACTCT 



3551 AAGCGCCACG CTTCCCGAAG GGAGAAAGGC GGACAGGTAT CCGGTAAGCG
TTCGCGGTGC GAAGGGCTTC CCTCTTTCCG CCTGTCCATA GGCCATTCGC 

3601 GCAGGGTCGG AACAGGAGAG CGCACGAGGG AGCTTCCAGG GGGAAACGCC 
CGTCCCAGCC TTGTCCTCTC GCGTGCTCCC TCGAAGGTCC CCCTTTGCGG 

3651 TGGTATCTTT ATAGTCCTGT CGGGTTTCGC CACCTCTGAC TTGAGCGTCG 
ACCATAGAAA TATCAGGACA GCCCAAAGCG GTGGAGACTG AACTCGCAGC

3701 ATTTTTGTGA TGCTCGTCAG GGGGGCGGAG CCTATGGAAA AACGCCAGCA
TAAAAACACT ACGAGCAGTC CCCCCGCCTC GGATACCTTT TTGCGGTCGT 

3751 ACGCGGCCTT TTTACGGTTC CTGGCCTTTT GCTGGCCTTT TGCTCACATG 
TGCGCCGGAA AAATGCCAAG GACCGGAAAA CGACCGGAAA ACGAGTGTAC 

3801 TTCTTTCCTG CGTTATCCCC TGATTCTGTG GATAACCGTA TTACCGCCTT 
AAGAAAGGAC GCAATAGGGG ACTAAGACAC CTATTGGCAT AATGGCGGAA 

3851 TGAGTGAGCT GATACCGCTC GCCGCAGCCG AACGACCGAG CGCAGCGAGT 
ACTCACTCGA CTATGGCGAG CGGCGTCGGC TTGCTGGCTC GCGTCGCTCA 

3901 CAGTGAGCGA GGAAGCGGAA GAGCGCCCAA TACGCAAACC GCCTCTCCCC 
GTCACTCGCT CCTTCGCCTT CTCGCGGGTT ATGCGTTTGG CGGAGAGGGG 

3951 GCGCGTTGGC CGATTCATTA ATGCAGCTGG CACGACAGGT TTCCCGACTG 
CGCGCAACCG GCTAAGTAAT TACGTCGACC GTGCTGTCCA AAGGGCTGAC

4001 GAAAGCGGGC AGTGAGCGCA ACGCAATTAA TGTGAGTTAG CTCACTCATT 
CTTTCGCCCG TCACTCGCGT TGCGTTAATT ACACTCAATC GAGTGAGTAA 

4051 AGGCACCCCA GGCTTTACAC TTTATGCTTC CGGCTCGTAT GTTGTGTGGA 
TCCGTGGGGT CCGAAATGTG AAATACGAAG GCCGAGCATA CAACACACCT 

4101 ATTGTGAGCG GATAACAATT TCACACAGGA AACAGCTATG ACCATGATTA 
TAACACTCGC CTATTGTTAA AGTGTGTCCT TTGTCGATAC TGGTACTAAT 

sod-3 promoter + coding sequence



PstI 



4151 CGCCAAGCTT GCATGCCTGC AGTGATTCAG AGAGGTTGAG AATTATTTTC 
GCGGTTCGAA CGTACGGACG TCACTAAGTC TCTCCAACTC TTAATAAAAG 

sod-3 promoter + coding sequence



4201 AAAAACATTC AATGTTTTCC CTTGGAGTGA CTATGCAAAT ATGAAAATGT
TTTTTGTAAG TTACAAAAGG GAACCTCACT GATACGTTTA TACTTTTACA 

sod-3 promoter + coding sequence 



4251 TTTCCAAAAA TATTTGGATG CCCTGATAAA AAGTAGGTGA AATTTCGCAG 
AAAGGTTTTT ATAAACCTAC GGGACTATTT TTCATCCACT TTAAAGCGTC 



sod-3 promoter + coding sequence
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Fig. 20 continued



4301 GGGAACATCA TATTAAAATG TTGAATTTTT AGAAGAAATG GAAATGTTTG 
CCCTTGTAGT ATAATTTTAC AACTTAAAAA TCTTCTTTAC CTTTACAAAC 



sod-3 promoter + coding sequence 



4351 TCGGTGGTAT GCTCGAATAT TTGAGATATT ATATATTTAC TGTTAAATCC
AGCCACCATA CGAGCTTATA AACTCTATAA TATATAAATG ACAATTTAGG 



sod-3 promoter + coding sequence



4401 GAAATTTTTG ACAAACGGAA AAAATTTGTG TCGAAATACT ACATTTTCGA
CTTTAAAAAC TGTTTGCCTT TTTTAAACAC AGCTTTATGA TGTAAAAGCT 

sod-3 promoter + coding sequence 

4451 TAACACAAAG GTACTTCCAT AACACTTATA AAAACTGTTT GACTATCTTA
ATTGTGTTTC CATGAAGGTA TTGTGAATAT TTTTGACAAA CTGATAGAAT 

sod-3 promoter + coding sequence 

4501 TTTCAGGAAA AAAAAATCCA AGAATAAACA TTTTTCAGAA TTTGAACTTT 
AAAGTCCTTT TTTTTTAGGT TCTTATTTGT AAAAAGTCTT AAACTTGAAA 

sod-3 promoter + coding sequence 

4551 CTAATGGCTG ATTAATAAAA CAAAGTTATA CAACTATTCA AAGCAGTTGC
GATTACCGAC TAATTATTTT GTTTCAATAT GTTGATAAGT TTCGTCAACG 

sod-3 promoter + coding sequence 

4601 TCAATCTGGC ATTTTCTTGT GTTTTTTTTT GAATATTTCA TCAGCAAGAT
AGTTAGACCG TAAAAGAACA CAAAAAAAAA CTTATAAAGT AGTCGTTCTA 

sod-3 promoter + coding sequence 



4651 GTTGATAATT TTGTGTTAAT TCTAATTGTT TTCTACAATT TTTCAAACCG
CAACTATTAA AACACAATTA AGATTAACAA AAGATGTTAA AAAGTTTGGC 

sod-3 promoter + coding sequence 



4701 AAAATTGACC TTTGACTTTG TTTACTTTGT TCTCGTGGGT TAACTGTTCA 
TTTTAACTGG AAACTGAAAC AAATGAAACA AGAGCACCCA ATTGACAAGT 

sod-3 promoter + coding sequence 

4751 CTGATTTCTA TTGCTGTTGA TGAGGTCTTT GATCAAATTT GTATTGTTTT 
GACTAAAGAT AACGACAACT ACTCCAGAAA CTAGTTTAAA CATAACAAAA 

sod-3 promoter + coding sequence



4801 TATACTGCAT ATTGCTTCAA TTCTAAATCA TCTAATATAT TGTCAAACAA
ATATGACGTA TAACGAAGTT AAGATTTAGT AGATTATATA ACAGTTTGTT
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sod-3 promoter + coding sequence

4851 CTTCTTGTTT TTTTTTTCAT TCAAAACTTC TGCAAAAACG TTCTCTTAAC
GAAGAACAAA AAAAAAAGTA AGTTTTGAAG ACGTTTTTGC AAGAGAATTG

sod-3 promoter + coding sequence

4901 AAAGGTTCAC ACAACAACTC TCCTCTCCAT CTCTTTCTCT CAACAACAAT
TTTCCAAGTG TGTTGTTGAG AGGAGAGGTA GAGAAAGAGA GTTGTTGTTA

sod-3 promoter + coding sequence

4951 GTGCTGGCCT TGCATGTTTG CCAGTGCGGG TTGTTTACGC GTTTTCAAGA
GACGACCGGA ACGTACAAAC GGTCACGCCC AACAAATGCG CAAAAGTTCT

sod-3 promoter + coding sequence



5001 TTTTTGGTCT CCTATCTAAC GTCCCGAAAT GCATTTTTTC CTTTCATTTG 
AAAAACCAGA GGATAGATTG CAGGGCTTTA CGTAAAAAAG GAAAGTAAAC 

sod-3 promoter + coding sequence

5051 GTTTTTTTCT GTTCGAGAAA AGTGACCGTT TGTCAAATCT TCTAATTTTC 
CAAAAAAAGA CAAGCTCTTT TCACTGGCAA ACAGTTTAGA AGATTAAAAG 

sod-3 promoter + coding sequence

Exon 1 



5101 AGTGAATAAA ATGCTGCAAT CTACTGCTCG CACTGCTTCA AAGCTTGTTC 
TCACTTATTT TACGACGTTA GATGACGAGC GTGACGAAGT TTCGAACAAG 

sod-3 promoter + coding sequence

Exon 1 



5151 AACCGGTTGC GGGGTAAGTC AAAATGAAAT TTTCGTTTAA AAATTGGTTT 
TTGGCCAACG CCCCATTCAG TTTTACTTTA AAAGCAAATT TTTAACCAAA 

sod-3 promoter + coding sequence

5201 TTTTTGGTAT TATAGATAAA ACTTATACCA AAACAAAACA TATTTAGAAA 
AAAAACCATA ATATCTATTT TGAATATGGT TTTGTTTTGT ATAAATCTTT 

sod-3 promoter + coding sequence

5251 AACTTTAATA GAGAATAATT GTTTAATAAT TAATTTTTGC AAGCTCCTTT 
TTGAAATTAT CTCTTATTAA CAAATTATTA ATTAAAAACG TTCGAGGAAA 

sod-3 promoter + coding sequence 

5301 TAAATTAAGA CATCTAAAAC AGTTTTCAGC TTGATTGTTT TAATGGTTTA
ATTTAATTCT GTAGATTTTG TCAAAAGTCG AACTAACAAA ATTACCAAAT
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Fig. 20 continued

sod-3 promoter + coding sequence 



5351 GAAAGCAATA TTTGTATTTT GTGTTAAACT GAAAATATCT AGGAAATACT 
CTTTCGTTAT AAACATAAAA CACAATTTGA CTTTTATAGA TCCTTTATGA 

sod-3 promoter + coding sequence 

5401 ACTTTTAAAA TATTTGAAAC TTGAAATTTT AAAATTCCAA ATAATTTTAC 
TGAAAATTTT ATAAACTTTG AACTTTAAAA TTTTAAGGTT TATTAAAATG 

sod-3 promoter + coding sequence 



5451 TCATTTCCTA AAGTGTTTGA GTATTTGTAT CCTGTGCTGA CACCGAAATG
AGTAAAGGAT TTCACAAACT CATAAACATA GGACACGACT GTGGCTTTAC 

sod-3 promoter + coding sequence 

5501 TTCTCAATTT TGGAAAAAAA AGATTTTTAT CCGTATCTTC AGTCTTACAA 
AAGAGTTAAA ACCTTTTTTT TCTAAAAATA GGCATAGAAG TCAGAATGTT 

sod-3 promoter + coding sequence 

Exon 2 



5551 TTTTTTTCAC CTTTTTTTTC ATTTCAGAGT TCTCGCCGTC CGCTCCAAGC 
AAAAAAAGTG GAAAAAAAAG TAAAGTCTCA AGAGCGGCAG GCGAGGTTCG 

sod-3 promoter + coding sequence 

Exon 2 

5601 ACACTCTCCC AGATCTCCCA TTCGACTATG CAGATTTGGA ACCTGTAATC 
TGTGAGAGGG TCTAGAGGGT AAGCTGATAC GTCTAAACCT TGGACATTAG 

sod-3 promoter + coding sequence 

Exon 2 

5651 AGCCATGAAA TCATGCAGCT TCATCATCAA AAGCATCATG CCACCTACGT 
TCGGTACTTT AGTACGTCGA AGTAGTAGTT TTCGTAGTAC GGTGGATGCA 

sod-3 promoter + coding sequence 

Exon 2 



5701 GAACAATCTC AATCAGATCG AGGAGAAACT TCACGAGGCT GTTTCGAAAG 
CTTGTTAGAG TTAGTCTAGC TCCTCTTTGA AGTGCTCCGA CAAAGCTTTC 

sod-3 promoter + coding sequence 

Exon 3 



5751 GTTTTTTAAT CAGAAGATTT TGAAATGAAT TTTTTTTTTG GTATATAGGG
CAAAAAATTA GTCTTCTAAA ACTTTACTTA AAAAAAAAAC CATATATCCC
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sod-3 promoter + coding sequence 



Exon 3 



5801 AATCTAAAAG AAGCAATTGC TCTCCAACCA GCGCTGAAAT TCAATGGTGG 
TTAGATTTTC TTCGTTAACG AGAGGTTGGT CGCGACTTTA AGTTACCACC 

sod-3 promoter + coding sequence 

Exon 3 



5851 TGGACACATC AATCATTCTA TCTTCTGGAC CAACTTGGCT AAGGATGGTG 
ACCTGTGTAG TTAGTAAGAT AGAAGACCTG GTTGAACCGA TTCCTACCAC 

0GQ6 

sod-3 promoter + coding sequence 



Exon 3 

AscI 

5901 GAGAACCTTC AAAGGAGCTG ATGGACACTA TTAAGGCTTG G 
CTCTTGGAAG TTTCCTCGAC TACCTGTGAT AATTCCGAAC C 
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fa)- 71, 

II. Predicted DNA sequence 

sod-3 prom. + coding region 

PstI

1 GTGATTCAGA GAGGTTGAGA ATTATTTTCA AAAACATTCA ATGTTTTCCC 
CACTAAGTCT CTCCAACTCT TAATAAAAGT TTTTGTAAGT TACAAAAGGG 

sod-3 prom. + coding region 



51 TTGGAGTGAC TATGCAAATA TGAAAATGTT TTCCAAAAAT ATTTGGATGC 
AACCTCACTG ATACGTTTAT ACTTTTACAA AAGGTTTTTA TAAACCTACG 

sod-3 prom. + coding region 



101 CCTGATAAAA AGTAGGTGAA ATTTCGCAGG GGAACATCAT ATTAAAATGT 
GGACTATTTT TCATCCACTT TAAAGCGTCC CCTTGTAGTA TAATTTTACA 

sod-3 prom. + coding region

151 TGAATTTTTA GAAGAAATGG AAATGTTTGT CGGTGGTATG CTCGAATATT 
ACTTAAAAAT CTTCTTTACC TTTACAAACA GCCACCATAC GAGCTTATAA 

sod-3 prom. + coding region 



201 TGAGATATTA TATATTTACT GTTAAATCCG AAATTTTTGA CAAACGGAAA 
ACTCTATAAT ATATAAATGA CAATTTAGGC TTTAAAAACT GTTTGCCTTT 

sod-3 prom. + coding region 



251 AAATTTGTGT CGAAATACTA CATTTTCGAT AACACAAAGG TACTTCCATA 
TTTAAACACA GCTTTATGAT GTAAAAGCTA TTGTGTTTCC ATGAAGGTAT 

sod-3 prom. + coding region 

301 ACACTTATAA AAACTGTTTG ACTATCTTAT TTCAGGAAAA AAAAATCCAA 
TGTGAATATT TTTGACAAAC TGATAGAATA AAGTCCTTTT TTTTTAGGTT 

sod-3 prom. + coding region 



351 GAATAAACAT TTTTCAGAAT TTGAACTTTC TAATGGCTGA TTAATAAAAC 
CTTATTTGTA AAAAGTCTTA AACTTGAAAG ATTACCGACT AATTATTTTG

sod-3 prom. + coding region 



401 AAAGTTATAC AACTATTCAA AGCAGTTGCT CAATCTGGCA TTTTCTTGTG
TTTCAATATG TTGATAAGTT TCGTCAACGA GTTAGACCGT AAAAGAACAC

sod-3 prom. + coding region
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451 TTTTTTTTTG AATATTTCAT CAGCAAGATG TTGATAATTT TGTGTTAATT
AAAAAAAAAC TTATAAAGTA GTCGTTCTAC AACTATTAAA ACACAATTAA 

sod-3 prom. + coding region 

501 CTAATTGTTT TCTACAATTT TTCAAACCGA AAATTGACCT TTGACTTTGT 
GATTAACAAA AGATGTTAAA AAGTTTGGCT TTTAACTGGA AACTGAAACA 

sod-3 prom. + coding region 



551 TTACTTTGTT CTCGTGGGTT AACTGTTCAC TGATTTCTAT TGCTGTTGAT 
AATGAAACAA GAGCACCCAA TTGACAAGTG ACTAAAGATA ACGACAACTA 

sod-3 prom. + coding region 



601 GAGGTCTTTG ATCAAATTTG TATTGTTTTT ATACTGCATA TTGCTTCAAT 
CTCCAGAAAC TAGTTTAAAC ATAACAAAAA TATGACGTAT AACGAAGTTA 

sod-3 prom. + coding region 



651 TCTAAATCAT CTAATATATT GTCAAACAAC TTCTTGTTTT TTTTTTCATT 
AGATTTAGTA GATTATATAA CAGTTTGTTG AAGAACAAAA AAAAAAGTAA

sod-3 prom. + coding region 

701 CAAAACTTCT GCAAAAACGT TCTCTTAACA AAGGTTCACA CAACAACTCT
GTTTTGAAGA CGTTTTTGCA AGAGAATTGT TTCCAAGTGT GTTGTTGAGA 

sod-3 prom. + coding region 



751 CCTCTCCATC TCTTTCTCTC AACAACAATG TGCTGGCCTT GCATGTTTGC
GGAGAGGTAG AGAAAGAGAG TTGTTGTTAC ACGACCGGAA CGTACAAACG 

sod-3 prom. + coding region 

801 CAGTGCGGGT TGTTTACGCG TTTTCAAGAT TTTTGGTCTC CTATCTAACG 
GTCACGCCCA ACAAATGCGC AAAAGTTCTA AAAACCAGAG GATAGATTGC 

sod-3 prom. + coding region 



851 TCCCGAAATG CATTTTTTCC TTTCATTTGG TTTTTTTCTG TTCGAGAAAA 
AGGGCTTTAC GTAAAAAAGG AAAGTAAACC AAAAAAAGAC AAGCTCTTTT 

sod-3 prom. + coding region 

Exon 1 



901 GTGACCGTTT GTCAAATCTT CTAATTTTCA GTGAATAAAA TGCTGCAATC 
CACTGGCAAA CAGTTTAGAA GATTAAAAGT CACTTATTTT ACGACGTTAG 

sod-3 prom. + coding region 

Exon 1 



WO 01/93669 PCT/IB01/01199



951 TACTGCTCGC ACTGCTTCAA AGCTTGTTCA ACCGGTTGCG GGGTAAGTCA 
ATGACGAGCG TGACGAAGTT TCGAACAAGT TGGCCAACGC CCCATTCAGT 

sod-3 prom. + coding region 



1001 AAATGAAATT TTCGTTTAAA AATTGGTTTT TTTTGGTATT ATAGATAAAA 
TTTACTTTAA AAGCAAATTT TTAACCAAAA AAAACCATAA TATCTATTTT 

sod-3 prom. + coding region 

1051 CTTATACCAA AACAAAACAT ATTTAGAAAA ACTTTAATAG AGAATAATTG 
GAATATGGTT TTGTTTTGTA TAAATCTTTT TGAAATTATC TCTTATTAAC 

sod-3 prom. + coding region 



1101 TTTAATAATT AATTTTTGCA AGCTCCTTTT AAATTAAGAC ATCTAAAACA
AAATTATTAA TTAAAAACGT TCGAGGAAAA TTTAATTCTG TAGATTTTGT 

sod-3 prom. + coding region 



1151 GTTTTCAGCT TGATTGTTTT AATGGTTTAG AAAGCAATAT TTGTATTTTG 
CAAAAGTCGA ACTAACAAAA TTACCAAATC TTTCGTTATA AACATAAAAC 

sod-3 prom. + coding region 



1201 TGTTAAACTG AAAATATCTA GGAAATACTA CTTTTAAAAT ATTTGAAACT
ACAATTTGAC TTTTATAGAT CCTTTATGAT GAAAATTTTA TAAACTTTGA

sod-3 prom. + coding region 



1251 TGAAATTTTA AAATTCCAAA TAATTTTACT CATTTCCTAA AGTGTTTGAG 
ACTTTAAAAT TTTAAGGTTT ATTAAAATGA GTAAAGGATT TCACAAACTC 

sod-3 prom. + coding region 



1301 TATTTGTATC CTGTGCTGAC ACCGAAATGT TCTCAATTTT GGAAAAAAAA 
ATAAACATAG GACACGACTG TGGCTTTACA AGAGTTAAAA CCTTTTTTTT 

sod-3 prom. + coding region 



1351 GATTTTTATC CGTATCTTCA GTCTTACAAT TTTTTTCACC TTTTTTTTCA 
CTAAAAATAG GCATAGAAGT CAGAATGTTA AAAAAAGTGG AAAAAAAAGT 

sod-3 prom. + coding region 

Exon 2

1401 TTTCAGAGTT CTCGCCGTCC GCTCCAAGCA CACTCTCCCA GATCTCCCAT 
AAAGTCTCAA GAGCGGCAGG CGAGGTTCGT GTGAGAGGGT CTAGAGGGTA 

sod-3 prom. + coding region 

Exon 2 
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1451 TCGACTATGC AGATTTGGAA CCTGTAATCA GCCATGAAAT CATGCAGCTT 
AGCTGATACG TCTAAACCTT GGACATTAGT CGGTACTTTA GTACGTCGAA 

sod-3 prom. + coding region

Exon 2 



1501 CATCATCAAA AGCATCATGC CACCTACGTG AACAATCTCA ATCAGATCGA 
GTAGTAGTTT TCGTAGTACG GTGGATGCAC TTGTTAGAGT TAGTCTAGCT 

sod-3 prom. + coding region



Exon 2 



1551 GGAGAAACTT CACGAGGCTG TTTCGAAAGG TTTTTTAATC AGAAGATTTT 
CCTCTTTGAA GTGCTCCGAC AAAGCTTTCC AAAAAATTAG TCTTCTAAAA 

sod-3 prom. + coding region 

Exon 3 



1601 GAAATGAATT TTTTTTTTGG TATATAGGGA ATCTAAAAGA AGCAATTGCT 
CTTTACTTAA AAAAAAAACC ATATATCCCT TAGATTTTCT TCGTTAACGA 

sod-3 prom. + coding region 



Exon 3 



1651 CTCCAACCAG CGCTGAAATT CAATGGTGGT GGACACATCA ATCATTCTAT
GAGGTTGGTC GCGACTTTAA GTTACCACCA CCTGTGTAGT TAGTAAGATA 

OGQ8 

sod-3 prom. + coding region 



Exon 3

1701 CTTCTGGACC AACTTGGCTA AGGATGGTGG AGAACCTTCA AAGGAGCTGA
GAAGACCTGG TTGAACCGAT TCCTACCACC TCTTGGAAGT TTCCTCGACT 

0GQ8 

sod-3 prom. + coding region



Exon 3 

SacI 



1751 TGGACACTAT TAAGCCGAGC TCAGAAAAAA TGACTGCTCC AAAGAAGAAG 
ACCTGTGATA ATTCGGCTCG AGTCTTTTTT ACTGACGAGG TTTCTTCTTC 

luc+ 



1801 CGTAAGGTAC CGGTAGAAAA AATGGAAGAC GCCAAAAACA TAAAGAAAGG
GCATTCCATG GCCATCTTTT TTACCTTCTG CGGTTTTTGT ATTTCTTTCC

luc+ 



1851 CCCGGCGCCA TTCTATCCGC TGGAAGATGG AACCGCTGGA GAGCAACTGC 
GGGCCGCGGT AAGATAGGCG ACCTTCTACC TTGGCGACCT CTCGTTGACG 

luc+ 



1901 ATAAGGCTAT GAAGAGATAC GCCCTGGTTC CTGGAACAAT TGCTTTTACA 
TATTCCGATA CTTCTCTATG CGGGACCAAG GACCTTGTTA ACGAAAATGT 

luc+ 

1951 GATGCACATA TCGAGGTGGA CATCACTTAC GCTGAGTACT TCGAAATGTC 
CTACGTGTAT AGCTCCACCT GTAGTGAATG CGACTCATGA AGCTTTACAG 

luc+ 

2001 CGTTCGGTTG GCAGAAGCTA TGAAACGATA TGGGCTGAAT ACAAATCACA 
GCAAGCCAAC CGTCTTCGAT ACTTTGCTAT ACCCGACTTA TGTTTAGTGT 

luc+ 



2051 GAATCGTCGT ATGCAGTGAA AACTCTCTTC AATTCTTTAT GCCGGTGTTG 
CTTAGCAGCA TACGTCACTT TTGAGAGAAG TTAAGAAATA CGGCCACAAC

luc+ 



2101 GGCGCGTTAT TTATCGGAGT TGCAGTTGCG CCCGCGAACG ACATTTATAA 
CCGCGCAATA AATAGCCTCA ACGTCAACGC GGGCGCTTGC TGTAAATATT 

luc+ 



2151 TGAACGTGAA TTGCTCAACA GTATGGGCAT TTCGCAGCCT ACCGTGGTGT 
ACTTGCACTT AACGAGTTGT CATACCCGTA AAGCGTCGGA TGGCACCACA 

luc+ 



2201 TCGTTTCCAA AAAGGGGTTG CAAAAAATTT TGAACGTGCA AAAAAAGCTC 
AGCAAAGGTT TTTCCCCAAC GTTTTTTAAA ACTTGCACGT TTTTTTCGAG 

luc+ 



2251 CCAATCATCC AAAAAATTAT TATCATGGAT TCTAAAACGG ATTACCAGGG 
GGTTAGTAGG TTTTTTAATA ATAGTACCTA AGATTTTGCC TAATGGTCCC 



2301 ATTTCAGTCG ATGTACACGT TCGTCACATC TCATCTACCT CCCGGTTTTA
TAAAGTCAGC TACATGTGCA AGCAGTGTAG AGTAGATGGA GGGCCAAAAT 



luc+ 
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2351 ATGAATACGA TTTTGTGCCA GAGTCCTTCG ATAGGGACAA GACAATTGCA 
TACTTATGCT AAAACACGGT CTCAGGAAGC TATCCCTGTT CTGTTAACGT 

luc+ 

2401 CTGATCATGA ACTCCTCTGG ATCTACTGGT CTGCCTAAAG GTGTCGCTCT
GACTAGTACT TGAGGAGACC TAGATGACCA GACGGATTTC CACAGCGAGA

luc+

2451 GCCTCATAGA ACTGCCTGCG TGAGATTCTC GCATGCCAGA GATCCTATTT
CGGAGTATCT TGACGGACGC ACTCTAAGAG CGTACGGTCT CTAGGATAAA 

luc+ 

2501 TTGGCAATCA AATCATTCCG GATACTGCGA TTTTAAGTGT TGTTCCATTC 
AACCGTTAGT TTAGTAAGGC CTATGACGCT AAAATTCACA ACAAGGTAAG 

luc+ 

2551 CATCACGGTT TTGGAATGTT TACTACACTC GGATATTTGA TATGTGGATT 
GTAGTGCCAA AACCTTACAA ATGATGTGAG CCTATAAACT ATACACCTAA

luc+ 

2601 TCGAGTCGTC TTAATGTATA GATTTGAAGA AGAGCTGTTT CTGAGGAGCC
AGCTCAGCAG AATTACATAT CTAAACTTCT TCTCGACAAA GACTCCTCGG 

luc+ 

2651 TTCAGGATTA CAAGATTCAA AGTGCGCTGC TGGTGCCAAC CCTATTCTCC 
AAGTCCTAAT GTTCTAAGTT TCACGCGACG ACCACGGTTG GGATAAGAGG 

luc+ 

2701 TTCTTCGCCA AAAGCACTCT GATTGACAAA TACGATTTAT CTAATTTACA
AAGAAGCGGT TTTCGTGAGA CTAACTGTTT ATGCTAAATA GATTAAATGT 

luc+ 

2751 CGAAATTGCT TCTGGTGGCG CTCCCCTCTC TAAGGAAGTC GGGGAAGCGG 
GCTTTAACGA AGACCACCGC GAGGGGAGAG ATTCCTTCAG CCCCTTCGCC 

luc+ 

2801 TTGCCAAGAG GTTCCATCTG CCAGGTATCA GGCAAGGATA TGGGCTCACT 
AACGGTTCTC CAAGGTAGAC GGTCCATAGT CCGTTCCTAT ACCCGAGTGA 

luc+

2851 GAGACTACAT CAGCTATTCT GATTACACCC GAGGGGGATG ATAAACCGGG 
CTCTGATGTA GTCGATAAGA CTAATGTGGG CTCCCCCTAC TATTTGGCCC 

luc+ 
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2901 CGCGGTCGGT AAAGTTGTTC CATTTTTTGA AGCGAAGGTT GTGGATCTGG 
GCGCCAGCCA TTTCAACAAG GTAAAAAACT TCGCTTCCAA CACCTAGACC 



luc+ 



2951 ATACCGGGAA AACGCTGGGC GTTAATCAAA GAGGCGAACT GTGTGTGAGA 
TATGGCCCTT TTGCGACCCG CAATTAGTTT CTCCGCTTGA CACACACTCT 



luc+ 



3001 GGTCCTATGA TTATGTCCGG TTATGTAAAC AATCCGGAAG CGACCAACGC 
CCAGGATACT AATACAGGCC AATACATTTG TTAGGCCTTC GCTGGTTGCG 



luc+



3051 CTTGATTGAC AAGGATGGAT GGCTACATTC TGGAGACATA GCTTACTGGG 
GAACTAACTG TTCCTACCTA CCGATGTAAG ACCTCTGTAT CGAATGACCC 



luc+ 



3101 ACGAAGACGA ACACTTCTTC ATCGTTGACC GCCTGAAGTC TCTGATTAAG
TGCTTCTGCT TGTGAAGAAG TAGCAACTGG CGGACTTCAG AGACTAATTC



luc+ 



3151 TACAAAGGCT ATCAGGTGGC TCCCGCTGAA TTGGAATCCA TCTTGCTCCA 
ATGTTTCCGA TAGTCCACCG AGGGCGACTT AACCTTAGGT AGAACGAGGT 



luc+ 



3201 ACACCCCAAC ATCTTCGACG CAGGTGTCGC AGGTCTTCCC GACGATGACG 
TGTGGGGTTG TAGAAGCTGC GTCCACAGCG TCCAGAAGGG CTGCTACTGC 



luc+ 



3251 CCGGTGAACT TCCCGCCGCC GTTGTTGTTT TGGAGCACGG AAAGACGATG 
GGCCACTTGA AGGGCGGCGG CAACAACAAA ACCTCGTGCC TTTCTGCTAC 



luc+ 



3301 ACGGAAAAAG AGATCGTGGA TTACGTCGCC AGTCAAGTAA CAACCGCGAA 
TGCCTTTTTC TCTAGCACCT AATGCAGCGG TCAGTTCATT GTTGGCGCTT 



luc+ 



3351 AAAGTTGCGC GGAGGAGTTG TGTTTGTGGA CGAAGTACCG AAAGGTCTTA 
TTTCAACGCG CCTCCTCAAC ACAAACACCT GCTTCATGGC TTTCCAGAAT



luc+ 



3401 CCGGAAAACT CGACGCAAGA AAAATCAGAG AGATCCTCAT AAAGGCCAAG
GGCCTTTTGA GCTGCGTTCT TTTTAGTCTC TCTAGGAGTA TTTCCGGTTC
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luc+ unc-54 3' UTR

3451 AAGGGCGGAA AGATCGCCGT GTAATTCTAG GAATTCCAAC TGAGCGCCGG 
TTCCCGCCTT TCTAGCGGCA CATTAAGATC CTTAAGGTTG ACTCGCGGCC 

unc-54 3' UTR 

3501 TCGCTACCAT TACCAACTTG TCTGGTGTCA AAAATAATAG GGGCCGCTGT 
AGCGATGGTA ATGGTTGAAC AGACCACAGT TTTTATTATC CCCGGCGACA 

unc-54 3' UTR



3551 CATCAGAGTA AGTTTAAACT GAGTTCTACT AACTAACGAG TAATATTTAA 
GTAGTCTCAT TCAAATTTGA CTCAAGATGA TTGATTGCTC ATTATAAATT 

unc-54 3' UTR 



3601 ATTTTCAGCA TCTCGCGCCC GTGCCTCTGA CTTCTAAGTC CAATTACTCT 
TAAAAGTCGT AGAGCGCGGG CACGGAGACT GAAGATTCAG GTTAATGAGA 

unc-54 3' UTR

3651 TCAACATCCC TACATGCTCT TTCTCCCTGT GCTCCCACCC CCTATTTTTG
AGTTGTAGGG ATGTACGAGA AAGAGGGACA CGAGGGTGGG GGATAAAAAC 

unc-54 3' UTR 



3701 TTATTATCAA AAAAACTTCT TCTTAATTTC TTTGTTTTTT AGCTTCTTTT 
AATAATAGTT TTTTTGAAGA AGAATTAAAG AAACAAAAAA TCGAAGAAAA 

unc-54 3' UTR 

3751 AAGTCACCTC TAACAATGAA ATTGTGTAGA TTCAAAAATA GAATTAATTC 
TTCAGTGGAG ATTGTTACTT TAACACATCT AAGTTTTTAT CTTAATTAAG 

unc-54 3' UTR 

3801 GTAATAAAAA GTCGAAAAAA ATTGTGCTCC CTCCCCCCAT TAATAATAAT 
CATTATTTTT CAGCTTTTTT TAACACGAGG GAGGGGGGTA ATTATTATTA 

unc-54 3' UTR



3851 TCTATCCCAA AATCTACACA ATGTTCTGTG TACACTTCTT ATGTTTTTTT
AGATAGGGTT TTAGATGTGT TACAAGACAC ATGTGAAGAA TACAAAAAAA

unc-54 3' UTR

3901 TACTTCTGAT AAATTTTTTT TGAAACATCA TAGAAAAAAC CGCACACAAA
ATGAAGACTA TTTAAAAAAA ACTTTGTAGT ATCTTTTTTG GCGTGTGTTT 

unc-54 3' UTR 



3951 ATACCTTATC ATATGTTACG TTTCAGTTTA TGACCGCAAT TTTTATTTCT
TATGGAATAG TATACAATGC AAAGTCAAAT ACTGGCGTTA AAAATAAAGA
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unc-54 3' UTR

4001 TCGCACGTCT GGGCCTCTCA TGACGTCAAA TCATGCTCAT CGTGAAAAAG
AGCGTGCAGA CCCGGAGAGT ACTGCAGTTT AGTACGAGTA GCACTTTTTC 

unc-54 3' UTR 



4051 TTTTGGAGTA TTTTTGGAAT TTTTCAATCA AGTGAAAGTT TATGAAATTA 
AAAACCTCAT AAAAACCTTA AAAAGTTAGT TCACTTTCAA ATACTTTAAT 

unc-54 3' UTR



4101 ATTTTCCTGC TTTTGCTTTT TGGGGGTTTC CCCTATTGTT TGTCAAGAGT 
TAAAAGGACG AAAACGAAAA ACCCCCAAAG GGGATAACAA ACAGTTCTCA 

unc-54 3' UTR



4151 TTCGAGGACG GCGTTTTTCT TGCTAAAATC ACAAGTATTG ATGAGCACGA 
AAGCTCCTGC CGCAAAAAGA ACGATTTTAG TGTTCATAAC TACTCGTGCT 

unc-54 3' UTR 


4201 TGCAAGAAAG ATCGGAAGAA GGTTTGGGTT TGAGGCTCAG TGGAAGGTGA 
ACGTTCTTTC TAGCCTTCTT CCAAACCCAA ACTCCGAGTC ACCTTCCACT

unc-54 3' UTR 

4251 GTAGAAGTTG ATAATTTGAA AGTGGAGTAG TGTCTATGGG GTTTTTGCCT 
CATCTTCAAC TATTAAACTT TCACCTCATC ACAGATACCC CAAAAACGGA 

unc-54 3' UTR MSC II 



4301 TAAATGACAG AATACATTCC CAATATACCA AACATAACTG TTTCCTACTA
ATTTACTGTC TTATGTAAGG GTTATATGGT TTGTATTGAC AAAGGATGAT 

MSC II 

4351 GTCGGCCGTA CGGGCCCTTT CGTCTCGCGC GTTTCGGTGA TGACGGTGAA
CAGCCGGCAT GCCCGGGAAA GCAGAGCGCG CAAAGCCACT ACTGCCACTT

4401 AACCTCTGAC ACATGCAGCT CCCGGAGACG GTCACAGCTT GTCTGTAAGC
TTGGAGACTG TGTACGTCGA GGGCCTCTGC CAGTGTCGAA CAGACATTCG

4451 GGATGCCGGG AGCAGACAAG CCCGTCAGGG CGCGTCAGCG GGTGTTGGCG
CCTACGGGCC TCGTCTGTTC GGGCAGTCCC GCGCAGTCGC CCACAACCGC 



4501 GGTGTCGGGG CTGGCTTAAC TATGCGGCAT CAGAGCAGAT TGTACTGAGA 
CCACAGCCCC GACCGAATTG ATACGCCGTA GTCTCGTCTA ACATGACTCT 

4551 GTGCACCATA TGCGGTGTGA AATACCGCAC AGATGCGTAA GGAGAAAATA
CACGTGGTAT ACGCCACACT TTATGGCGTG TCTACGCATT CCTCTTTTAT 

4601 CCGCATCAGG CGGCCTTAAG GGCCTCGTGA TACGCCTATT TTTATAGGTT 



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GGCGTAGTCC GCCGGAATTC CCGGAGCACT ATGCGGATAA AAATATCCAA 

4651 AATGTCATGA TAATAATGGT TTCTTAGACG TCAGGTGGCA CTTTTCGGGG 
TTACAGTACT ATTATTACCA AAGAATCTGC AGTCCACCGT GAAAAGCCCC

4701 AAATGTGCGC GGAACCCCTA TTTGTTTATT TTTCTAAATA CATTCAAATA
TTTACACGCG CCTTGGGGAT AAACAAATAA AAAGATTTAT GTAAGTTTAT 

4751 TGTATCCGCT CATGAGACAA TAACCCTGAT AAATGCTTCA ATAATATTGA 
ACATAGGCGA GTACTCTGTT ATTGGGACTA TTTACGAAGT TATTATAACT 

amp 

4801 AAAAGGAAGA GTATGAGTAT TCAACATTTC CGTGTCGCCC TTATTCCCTT
TTTTCCTTCT CATACTCATA AGTTGTAAAG GCACAGCGGG AATAAGGGAA 

amp 

4851 TTTTGCGGCA TTTTGCCTTC CTGTTTTTGC TCACCCAGAA ACGCTGGTGA
AAAACGCCGT AAAACGGAAG GACAAAAACG AGTGGGTCTT TGCGACCACT 

amp 



4901 AAGTAAAAGA TGCTGAAGAT CAGTTGGGTG CACGAGTGGG TTACATCGAA
TTCATTTTCT ACGACTTCTA GTCAACCCAC GTGCTCACCC AATGTAGCTT 

amp 



4951 CTGGATCTCA ACAGCGGTAA GATCCTTGAG AGTTTTCGCC CCGAAGAACG
GACCTAGAGT TGTCGCCATT CTAGGAACTC TCAAAAGCGG GGCTTCTTGC 

amp 

5001 TTTTCCAATG ATGAGCACTT TTAAAGTTCT GCTATGTGGC GCGGTATTAT 
AAAAGGTTAC TACTCGTGAA AATTTCAAGA CGATACACCG CGCCATAATA 

amp

5051 CCCGTATTGA CGCCGGGCAA GAGCAACTCG GTCGCCGCAT ACACTATTCT 
GGGCATAACT GCGGCCCGTT CTCGTTGAGC CAGCGGCGTA TGTGATAAGA 

amp 



5101 CAGAATGACT TGGTTGAGTA CTCACCAGTC ACAGAAAAGC ATCTTACGGA 
GTCTTACTGA ACCAACTCAT GAGTGGTCAG TGTCTTTTCG TAGAATGCCT 


amp 



5151 TGGCATGACA GTAAGAGAAT TATGCAGTGC TGCCATAACC ATGAGTGATA 
ACCGTACTGT CATTCTCTTA ATACGTCACG ACGGTATTGG TACTCACTAT

amp 



5201 ACACTGCGGC CAACTTACTT CTGACAACGA TCGGAGGACC GAAGGAGCTA
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TGTGACGCCG GTTGAATGAA GACTGTTGCT AGCCTCCTGG CTTCCTCGAT 
amp 

5251 ACCGCTTTTT TGCACAACAT GGGGGATCAT GTAACTCGCC TTGATCGTTG 
TGGCGAAAAA ACGTGTTGTA CCCCCTAGTA CATTGAGCGG AACTAGCAAC 



amp 



5301 GGAACCGGAG CTGAATGAAG CCATACCAAA CGACGAGCGT GACACCACGA 
CCTTGGCCTC GACTTACTTC GGTATGGTTT GCTGCTCGCA CTGTGGTGCT 

amp 



5351 TGCCTGTAGC AATGGCAACA ACGTTGCGCA AACTATTAAC TGGCGAACTA 
ACGGACATCG TTACCGTTGT TGCAACGCGT TTGATAATTG ACCGCTTGAT 

amp 

5401 CTTACTCTAG CTTCCCGGCA ACAATTAATA GACTGGATGG AGGCGGATAA 
GAATGAGATC GAAGGGCCGT TGTTAATTAT CTGACCTACC TCCGCCTATT 

amp 



5451 AGTTGCAGGA CCACTTCTGC GCTCGGCCCT TCCGGCTGGC TGGTTTATTG 
TCAACGTCCT GGTGAAGACG CGAGCCGGGA AGGCCGACCG ACCAAATAAC

amp 



5501 CTGATAAATC TGGAGCCGGT GAGCGTGGGT CTCGCGGTAT CATTGCAGCA 
GACTATTTAG ACCTCGGCCA CTCGCACCCA GAGCGCCATA GTAACGTCGT 

amp 

5551 CTGGGGCCAG ATGGTAAGCC CTCCCGTATC GTAGTTATCT ACACGACGGG 
GACCCCGGTC TACCATTCGG GAGGGCATAG CATCAATAGA TGTGCTGCCC 



amp 



5601 GAGTCAGGCA ACTATGGATG AACGAAATAG ACAGATCGCT GAGATAGGTG 
CTCAGTCCGT TGATACCTAC TTGCTTTATC TGTCTAGCGA CTCTATCCAC 

amp 



5651 CCTCACTGAT TAAGCATTGG TAACTGTCAG ACCAAGTTTA CTCATATATA 
GGAGTGACTA ATTCGTAACC ATTGACAGTC TGGTTCAAAT GAGTATATAT

5701 CTTTAGATTG ATTTAAAACT TCATTTTTAA TTTAAAAGGA TCTAGGTGAA 
GAAATCTAAC TAAATTTTGA AGTAAAAATT AAATTTTCCT AGATCCACTT 

5751 GATCCTTTTT GATAATCTCA TGACCAAAAT CCCTTAACGT GAGTTTTCGT 
CTAGGAAAAA CTATTAGAGT ACTGGTTTTA GGGAATTGCA CTCAAAAGCA 



5801 TCCACTGAGC GTCAGACCCC GTAGAAAAGA TCAAAGGATC TTCTTGAGAT 
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AGGTGACTCG CAGTCTGGGG CATCTTTTCT AGTTTCCTAG AAGAACTCTA 

5851 CCTTTTTTTC TGCGCGTAAT CTGCTGCTTG CAAACAAAAA AACCACCGCT 
GGAAAAAAAG ACGCGCATTA GACGACGAAC GTTTGTTTTT TTGGTGGCGA 

5901 ACCAGCGGTG GTTTGTTTGC CGGATCAAGA GCTACCAACT CTTTTTCCGA 
TGGTCGCCAC CAAACAAACG GCCTAGTTCT CGATGGTTGA GAAAAAGGCT 

5951 AGGTAACTGG CTTCAGCAGA GCGCAGATAC CAAATACTGT CCTTCTAGTG 
TCCATTGACC GAAGTCGTCT CGCGTCTATG GTTTATGACA GGAAGATCAC

6001 TAGCCGTAGT TAGGCCACCA CTTCAAGAAC TCTGTAGCAC CGCCTACATA 
ATCGGCATCA ATCCGGTGGT GAAGTTCTTG AGACATCGTG GCGGATGTAT 

6051 CCTCGCTCTG CTAATCCTGT TACCAGTGGC TGCTGCCAGT GGCGATAAGT 
GGAGCGAGAC GATTAGGACA ATGGTCACCG ACGACGGTCA CCGCTATTCA 

6101 CGTGTCTTAC CGGGTTGGAC TCAAGACGAT AGTTACCGGA TAAGGCGCAG 
GCACAGAATG GCCCAACCTG AGTTCTGCTA TCAATGGCCT ATTCCGCGTC 

6151 CGGTCGGGCT GAACGGGGGG TTCGTGCACA CAGCCCAGCT TGGAGCGAAC 
GCCAGCCCGA CTTGCCCCCC AAGCACGTGT GTCGGGTCGA ACCTCGCTTG

6201 GACCTACACC GAACTGAGAT ACCTACAGCG TGAGCATTGA GAAAGCGCCA 
CTGGATGTGG CTTGACTCTA TGGATGTCGC ACTCGTAACT CTTTCGCGGT 

6251 CGCTTCCCGA AGGGAGAAAG GCGGACAGGT ATCCGGTAAG CGGCAGGGTC
GCGAAGGGCT TCCCTCTTTC CGCCTGTCCA TAGGCCATTC GCCGTCCCAG 

6301 GGAACAGGAG AGCGCACGAG GGAGCTTCCA GGGGGAAACG CCTGGTATCT 
CCTTGTCCTC TCGCGTGCTC CCTCGAAGGT CCCCCTTTGC GGACCATAGA 

6351 TTATAGTCCT GTCGGGTTTC GCCACCTCTG ACTTGAGCGT CGATTTTTGT 
AATATCAGGA CAGCCCAAAG CGGTGGAGAC TGAACTCGCA GCTAAAAACA 

6401 GATGCTCGTC AGGGGGGCGG AGCCTATGGA AAAACGCCAG CAACGCGGCC 
CTACGAGCAG TCCCCCCGCC TCGGATACCT TTTTGCGGTC GTTGCGCCGG 

6451 TTTTTACGGT TCCTGGCCTT TTGCTGGCCT TTTGCTCACA TGTTCTTTCC 
AAAAATGCCA AGGACCGGAA AACGACCGGA AAACGAGTGT ACAAGAAAGG 

6501 TGCGTTATCC CCTGATTCTG TGGATAACCG TATTACCGCC TTTGAGTGAG 
ACGCAATAGG GGACTAAGAC ACCTATTGGC ATAATGGCGG AAACTCACTC 

6551 CTGATACCGC TCGCCGCAGC CGAACGACCG AGCGCAGCGA GTCAGTGAGC 
GACTATGGCG AGCGGCGTCG GCTTGCTGGC TCGCGTCGCT CAGTCACTCG 

6601 GAGGAAGCGG AAGAGCGCCC AATACGCAAA CCGCCTCTCC CCGCGCGTTG 
CTCCTTCGCC TTCTCGCGGG TTATGCGTTT GGCGGAGAGG GGCGCGCAAC 

6651 GCCGATTCAT TAATGCAGCT GGCACGACAG GTTTCCCGAC TGGAAAGCGG 
CGGCTAAGTA ATTACGTCGA CCGTGCTGTC CAAAGGGCTG ACCTTTCGCC 



6701 GCAGTGAGCG CAACGCAATT AATGTGAGTT AGCTCACTCA TTAGGCACCC
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CGTCACTCGC GTTGCGTTAA TTACACTCAA TCGAGTGAGT AATCCGTGGG 

6751 CAGGCTTTAC ACTTTATGCT TCCGGCTCGT ATGTTGTGTG GAATTGTGAG 
GTCCGAAATG TGAAATACGA AGGCCGAGCA TACAACACAC CTTAACACTC 

6801 CGGATAACAA TTTCACACAG GAAACAGCTA TGACCATGAT TACGCCAAGC
GCCTATTGTT AAAGTGTGTC CTTTGTCGAT ACTGGTACTA ATGCGGTTCG 

6851 TGTAAGTTTA AACATGATCT TACTAACTAA CTATTCTCAT TTAAATTTTC 
ACATTCAAAT TTGTACTAGA ATGATTGATT GATAAGAGTA AATTTAAAAG 

6901 AGAGCTTAAA AATGGCTGAA ATCACTCACA ACGATGGATA CGCTAACAAC 
TCTCGAATTT TTACCGACTT TAGTGAGTGT TGCTACCTAT GCGATTGTTG 

PstI 

6951 TTGGAAATGA AATAAGCTTG CATGCCTGCA 
AACCTTTACT TTATTCGAAC GTACGGACGT 
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Figure 23 
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